Chapter 1


Introduction 


During the past few years, the technology for formal specification and verification of commu- 
nication protocols has matured to the point where we believe that it now provides practical 
assistance for protocol design and validation. Several models for distributed systems in gen- 
eral and communication protocols in particular have been developed, and recent advances in- 
clude formal models that allow reasoning about untimed systems as well as timed systems, e.g., 
[AL92a, GSSL93, LV93a, LV93b]. 

In connection with these models a host of proof techniques have been developed for proving 
that one protocol implements another. One class of proof techniques is the simulation techniques 
(including refinement mappings, and forward and backward simulations) [AL91, GSSL93, Jon91, 
LV92, LV93a, LV93b]. 

In this work, we show how one approach to formal specification and verification of distributed 
systems—the live (timed) I/O automata of [GSSL93]—can be used to verify an important class 
of communication protocols—those for reliable at-most-once message delivery. 

Thus, the report has two main parts: first, the formal framework of [GSSL93] is presented 
and augmented with additional theory (including a new temporal logic). Second, we consider the 
verification example. The purpose of our work is to provide better understanding, documentation 
and proof for the reliable at-most-once message delivery protocols, and to test the adequacy of
the formal framework. 


Formal Framework 


When formally developing new protocols or proving correctness of existing ones with respect 
to some specification, a stepwise approach is usually used: the specification is given in a very 
abstract manner in which abstract data types are used and where possibly no distributed struc- 
ture is present. In a series of development steps this specification is refined (or implemented) 
by introducing more low-level data types and by introducing a distributed view of the system, 
where different nodes (protocol entities) are connected by more or less reliable channels. 

By using a formal approach to systems specification, it is possible to prove formally that a 
low-level (concrete) protocol correctly implements the high-level (abstract) specification. Such 
a proof is performed by proving that each level in the step-wise development is correct with 
respect to (i.e., implements) the next more abstract level. This approach to verification implies 
that the task of proving correctness of a complicated protocol is split into more manageable
subtasks, and this greatly reduces the complexity of the overall proof. 


The models of [GSSL93] for untimed and timed systems use an automaton (or state machine) 
to express safety properties. A safety property ensures that the system never does anything
wrong by specifying the steps the system is allowed to perform during execution. However, a 
safety requirement does not guarantee that the system does anything at all. For that purpose 
the models of [GSSL93] contain an extra liveness condition. The liveness condition restricts 
the long-term behavior of the system by specifying what must eventually happen. An example 
of a liveness condition is the requirement that each process in a parallel system be given fair 
chances to proceed. In timed systems it is furthermore possible to specify timing requirements 
like deadlines, response times, etc.

The models of [GSSL93] are entirely semantic: they describe an abstract view of how dis- 
tributed systems behave when executed. Thus, they do not offer any syntax for writing down 
objects of the models. Such a syntax is presented in this work: 


• For writing down the automaton part of the models we use a Pascal-like notation which
makes our specifications look close to traditional ways of describing protocols for dis- 
tributed systems. 


• The liveness part of the models is specified using the language of an extended temporal logic
that we develop. This approach has the advantage that parts of the proofs of correctness 
can be performed using rules of the logic. 


An important property of the models of [GSSL93] is that they are compositional. This means 
that each component (e.g., node) in a complex system can be specified separately and that 
we can implement each component separately and yet obtain an implementation of the entire 
system. This enables a modular approach to systems specification and verification. 


We test the adequacy of the models and proof techniques by formalizing two existing protocols 
for solving the at-most-once message delivery problem and showing how these protocols can be 
proved correct. 


The At-Most-Once Message Delivery Problem 


The at-most-once message delivery problem is that of delivering a sequence of messages submit- 
ted by a user at one location to a user at another location. Ideally, we would like to insist that 
all messages be delivered in the order in which they are sent, each exactly once, and that an 
acknowledgement be returned for each delivered message.¹

Unfortunately, it is expensive to achieve these goals in the presence of failures (e.g., node 
crashes). In fact, it is impossible to achieve them at all unless some change is made to the 
stable state (i.e., the state that survives a crash) each time a message is delivered. To permit 
less expensive solutions, we weaken the statement of the problem slightly. We allow some 
messages to be lost when a node crash occurs; however, no messages should otherwise be lost, 
and those messages that are delivered should not be reordered or duplicated. (The specification 
is weakened in this way because message loss is generally considered to be less damaging than 
duplicate delivery.) Now it is required that the user receive either an acknowledgement that the 
message has been delivered, or in the case of crashes, an indication that the message might have 
been lost. 

There are various ways to solve the at-most-once message delivery problem. All are based on 
the idea of tagging a message with an identifier and transmitting it repeatedly to overcome the 


¹Our definition of at-most-once message delivery is different from what some people call at-most-once message
delivery in that we include acknowledgements and require messages to be delivered in order. 


unreliability of the channel. The receiver² keeps a stock of “good” identifiers that it has never
accepted before; when it sees a message tagged with a good identifier, it accepts it, delivers
it, and removes that identifier from the set. Otherwise, the receiver just discards the message,
perhaps after acknowledging it. In order for the sender to be sure that its message will be
delivered rather than discarded, it must tag the message with a good identifier. What makes
the implementations tricky is that the receiver will be keeping track of at least some of its good
identifiers in volatile (non-stable) memory, which gets lost in case the receiver node crashes. But
the sender does not immediately learn about the crash, so it may go on using these identifiers and
thus transmit messages that the receiver will reject. Different protocols use different methods 
to keep the sender and the receiver more or less in agreement about what identifiers to use. 
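
The receiver-side idea described above can be sketched in a few lines. This is only an illustration of the "good identifier" mechanism and is neither protocol C nor protocol H: the class name `Receiver`, its methods, and the choice to model a crash by forgetting the whole volatile set are assumptions made for this example.

```python
class Receiver:
    """Toy receiver that accepts a message only if its identifier is still good."""

    def __init__(self, good_ids):
        # Volatile set of identifiers that have never been accepted before.
        self.good = set(good_ids)
        self.delivered = []

    def receive(self, ident, msg):
        """Return True if the message is accepted and delivered, False if discarded."""
        if ident in self.good:
            self.good.remove(ident)   # each identifier is accepted at most once
            self.delivered.append(msg)
            return True
        return False                  # duplicate or unknown identifier

    def crash(self):
        # Hypothetical crash model: all volatile identifiers are forgotten.
        self.good.clear()


r = Receiver(good_ids={1, 2})
first = r.receive(1, "m1")   # accepted and delivered
dup = r.receive(1, "m1")     # retransmission is discarded, so no duplicate delivery
r.crash()
late = r.receive(2, "m2")    # the sender still uses an identifier the receiver forgot
```

In this sketch the last message is rejected rather than duplicated, which matches the weakened requirement that messages may be lost, but not duplicated, when a node crashes.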

A desirable property, which is not directly related to correctness, is that the implementations 
offer a way of cleaning up “old” information when this cannot affect the future behavior. 

In this work, we consider two protocols that are important in practice: the Clock-Based 
Protocol (which we call C) of Liskov, Shrira and Wroclawski [LSW91] and the Five-Packet 
Handshake Protocol (which we call H) of Belsnes [Bel76]. The latter is the standard protocol for 
setting up network connections, used in TCP, ISO TP-4, and many other transport protocols. 
It is sometimes called the three-way handshake, because only three packets are needed for 
message delivery; the additional packets are required for acknowledgement and cleaning up the 
state. The former protocol was developed as an example to show the usefulness of clocks in 
network protocols [Lis91] and has been implemented at M.I.T. Both protocols are sufficiently
complicated that formal specification and proof seem useful. 


Survey of the Example 


We express both protocols, H and C, as well as the formal specification S of the at-most-once
message delivery problem, in terms of the models of [GSSL93]. 

Although the two protocols appear to be quite different, we have found that both can be 
expressed formally as implementations of a common Generic Protocol G, which, in turn, is an 
implementation of the problem specification. To prove that G implements the specification, for 
proof-technical reasons we introduce an additional level of abstraction, the Delayed-Decision 
Specification D. This is depicted in Figure 1.1. Introducing intermediate levels of abstraction, 
like G and D, is a general proof strategy that allows large, complicated proofs to be split into 
smaller and more manageable subproofs.

The specification S is stated in the untimed model of [GSSL93] whereas the Clock-Based 
Protocol C uses the timed model. This apparent model inconsistency is resolved by considering 
S to be a timed system that does not put any constraints on real time. In [GSSL93] certain
embedding results provide the formal basis for moving between the timed and untimed model. 

In this report we provide almost complete proofs of correctness. Some parts of the proofs
are omitted; however, we treat all the different kinds of proofs and provide informal justification
for the missing parts.


Outline of the Report 


The report is structured as follows. In Part I we consider the formal framework: Chapter 2 
gives a brief introduction to the models of [GSSL93] and the embedding results. Chapters 3 and 


²We denote by “receiver” the protocol entity that is situated on the receiver node, and use phrases like “the
user at the receiver end” to denote the user that communicates with the receiver. Correspondingly for “sender”. 
Figure 1.1: Overview of the levels of abstraction (Delayed-Decision Specification; Generic Protocol; Clock-Based Protocol and Five-Packet Handshake Protocol).


4 describe the syntax we use for specifying systems: first, in Chapter 3, we define an extended 
temporal logic, and then, in Chapter 4, we specifically show how this temporal logic is used 
to specify liveness conditions. Chapter 5 describes the proof techniques we use when proving 
correctness of the protocols. These techniques are mainly taken from [GSSL93]. 

The remaining part of the report, Part II, deals with the at-most-once message delivery
example. First, in Chapter 6, we present the formal specification S of the at-most-once message
delivery problem. In Chapter 7 we present the Delayed-Decision Specification D and show that
it correctly implements S. Chapters 8–10 then formally specify the G, H, and C levels and
consider their correctness.

Finally, in Chapter 11, we give concluding remarks. 

The report contains three appendices. Appendix A introduces some basic notation and 
should be read before the rest of the report. Appendix B and Appendix C contain proofs of 
certain results in the main parts of the report. 
Part I


The Formal Framework 


Chapter 2 


The Model 


To make this report self-contained, we give a brief presentation of the operational models for 
distributed systems that are developed in [GSSL93]. We give all formal definitions and results 
that are needed but refer to [GSSL93] for details about, e.g., proofs and for a more thorough 
treatment of the models. 

We first present the model for untimed systems. Then the model for timed systems is 
presented, and finally we show how an untimed system can be thought of as a timed system 
that allows time to pass arbitrarily. 


2.1 The Model for Untimed Systems 


The model for untimed systems, called live I/O automata, which is developed in [GSSL93],
consists of an automaton part (or state machine), with a labeled transition relation, and a 
liveness condition. The automaton specifies the possible steps of the system, i.e., it specifies 
what is allowed to happen, thus, the safety of the system. The liveness condition restricts the 
long-term behavior of the system by specifying what must eventually happen. 

The liveness condition can be seen as a way of restricting the way the automaton is “executed” 
whenever it is working properly. A liveness condition for a system of two parallel processes might 
require that each component be given the possibility of making progress infinitely often. In this 
way executions where one component wishes to proceed but is never given a chance are ruled 
out. This kind of liveness is known as weak fairness and is implemented on a physical machine 
by executing the parallel processes on separate processors or by using a fair scheduler. In the 
examples in this work we will see examples of more complicated liveness requirements. 

As mentioned above the automaton part has a labeled transition relation. This means that 
each step of the automaton is labeled by a name, called an action. The set of actions is
partitioned into external and internal actions, where only the external actions are visible from 
the environment. The model is event-based in the sense that communication between parallel 
components of a system or between system and environment is modeled by joint actions. That 
is, communication is modeled as the joint executions of steps labeled by the same action. Thus, 
the states cannot be observed. For this reason correctness is based on the sequences of external 
actions (called traces) that can occur when the system is working properly, i.e., when its liveness 
condition is satisfied. 

To express a notion of system vs. environment, the external actions are partitioned into in- 
put and output actions, i.e., an I/O distinction is introduced. Intuitively output (and internal) 
actions are controlled by the system, and are thus called locally-controlled actions, whereas input
actions are controlled by the environment of the system. Since a system cannot control its envi- 
ronment, live I/O automata are required to be environment-free which intuitively means that no 
matter which inputs the environment provides during execution, the system can perform locally- 
controlled actions and in this way satisfy its liveness condition. Thus, the environment-freedom 
requirement ensures that live I/O automata do not have liveness conditions like: “sooner or
later input a arrives”. 

The environment-freedom requirement also implies that the automaton part of a live I/O 
automaton must be input-enabled which means that the automaton should be able to receive 
any input in any state. 

Even though our live I/O automaton model is not as general as a model without I/O dis- 
tinction and the environment-freedom requirement, a large number of systems can be specified 
using this model. In particular many distributed systems have a clear distinction between the 
output from the system and the input from the environment, and furthermore such systems are 
usually designed to be able to receive input at any time since processes are usually connected 
by networks that are not capable of buffering messages. In [GSSL93] a technical justification of 
environment-freedom is offered. This justification deals with the fact that without I/O distinction
and environment-freedom, a trace-based correctness notion such as the one mentioned above is
not adequate in that it cannot form the basis of a notion of implementation that corresponds to
our intuition. Furthermore, there exist simpler proof techniques for live I/O automata than for
more general models.

We first present the automaton part, called safe I/O automata. Then we add the liveness 
condition, discuss the notion of implementation, and state an important substitutivity property 
of the model. 


2.1.1 Safe I/O Automata 


Definition 2.1 (Safe I/O Automaton) 


A safe I/O automaton $A$ consists of four components:

• A set $\mathit{states}(A)$ of states.

• A nonempty set $\mathit{start}(A)$ of start states ($\mathit{start}(A) \subseteq \mathit{states}(A)$).

• An action signature $\mathit{sig}(A) = (\mathit{in}(A), \mathit{out}(A), \mathit{int}(A))$ of disjoint sets of input, output, and
internal actions, respectively. Denote by $\mathit{ext}(A)$ the set $\mathit{in}(A) \cup \mathit{out}(A)$ of external actions,
by $\mathit{local}(A)$ the set $\mathit{out}(A) \cup \mathit{int}(A)$ of locally-controlled actions, and by $\mathit{acts}(A)$ the set
$\mathit{ext}(A) \cup \mathit{int}(A)$ of actions.

• A transition relation $\mathit{steps}(A) \subseteq \mathit{states}(A) \times \mathit{acts}(A) \times \mathit{states}(A)$. The transition relation
$\mathit{steps}(A)$ must have the property that for each state $s \in \mathit{states}(A)$ and each input action
$a \in \mathit{in}(A)$ there exists a state $s' \in \mathit{states}(A)$ such that $(s, a, s') \in \mathit{steps}(A)$. $A$ is said to be
input-enabled.


An action $a$ is enabled in a state $s$ if there exists a state $s'$ such that $(s, a, s')$ is a step, i.e.,
$(s, a, s') \in \mathit{steps}(A)$. A set $\Lambda$ of actions is said to be enabled in state $s$ if there exists an action
$a \in \Lambda$ such that $a$ is enabled in $s$. An action or set of actions which is not enabled in a state $s$
is said to be disabled in $s$.
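
For finite automata, Definition 2.1 and the notion of an enabled action can be written down directly. The following is a minimal sketch, assuming finite state and action sets; the class name `SafeIOAutomaton`, its field names, and the one-place buffer used as an example are illustrative choices and do not come from [GSSL93].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeIOAutomaton:
    states: frozenset
    start: frozenset
    inputs: frozenset
    outputs: frozenset
    internals: frozenset
    steps: frozenset  # triples (s, a, s')

    def acts(self):
        return self.inputs | self.outputs | self.internals

    def enabled(self, s, a):
        """An action a is enabled in s if some step (s, a, s') exists."""
        return any(p == s and b == a for (p, b, _) in self.steps)

    def is_input_enabled(self):
        """Every input action must be enabled in every state."""
        return all(self.enabled(s, a) for s in self.states for a in self.inputs)


# Hypothetical one-place buffer: a second "send" overwrites the stored message.
buffer = SafeIOAutomaton(
    states=frozenset({"empty", "full"}),
    start=frozenset({"empty"}),
    inputs=frozenset({"send"}),
    outputs=frozenset({"deliver"}),
    internals=frozenset(),
    steps=frozenset({
        ("empty", "send", "full"),
        ("full", "send", "full"),
        ("full", "deliver", "empty"),
    }),
)
```

Dropping the step `("full", "send", "full")` would make the buffer fail `is_input_enabled()`, which is exactly the situation the definition rules out.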


An execution fragment $\alpha$ of a safe I/O automaton $A$ is a (finite or infinite) sequence of alternating
states and actions starting with a state and, if the execution fragment is finite, ending in a state

\[ \alpha = s_0 a_1 s_1 a_2 s_2 \cdots \]

where each $(s_i, a_{i+1}, s_{i+1}) \in \mathit{steps}(A)$. Denote by $\mathit{fstate}(\alpha)$ the first state of $\alpha$ and, if $\alpha$ is finite,
denote by $\mathit{lstate}(\alpha)$ the last state of $\alpha$. Furthermore, denote by $\mathit{frag}^*(A)$, $\mathit{frag}^\omega(A)$, and $\mathit{frag}(A)$
the sets of finite, infinite and all execution fragments of $A$, respectively. An execution is an
execution fragment whose first state is a start state. Denote by $\mathit{exec}^*(A)$, $\mathit{exec}^\omega(A)$ and $\mathit{exec}(A)$
the sets of finite, infinite and all executions of $A$, respectively. A state $s$ of $A$ is reachable if there
exists a finite execution of $A$ that ends in $s$.

A finite execution fragment $\alpha_1 = s_0 a_1 s_1 \cdots a_n s_n$ of $A$ and an execution fragment $\alpha_2 =
s_n a_{n+1} s_{n+1} \cdots$ of $A$ can be concatenated. In this case the concatenation, written $\alpha_1 \frown \alpha_2$, is
the execution fragment $s_0 a_1 s_1 \cdots a_n s_n a_{n+1} s_{n+1} \cdots$. Clearly, $\alpha_1 \frown \alpha_2$ is an execution iff $\alpha_1$ is an
execution.

An execution fragment $\alpha_1$ of $A$ is a prefix of an execution fragment $\alpha_2$ of $A$, written $\alpha_1 \le \alpha_2$,
if either $\alpha_1 = \alpha_2$ or $\alpha_1$ is finite and there exists an execution fragment $\alpha_1'$ of $A$ such that
$\alpha_2 = \alpha_1 \frown \alpha_1'$.

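Let $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ be an execution fragment. The length of $\alpha$ is the number of actions
occurring in $\alpha$. Thus,

\[ |\alpha| \triangleq \begin{cases} n & \text{if } \alpha \text{ is finite and ends in } s_n \\ \infty & \text{if } \alpha \text{ is infinite} \end{cases} \]

Define the $i$th prefix and $i$th suffix of $\alpha$, for $0 \le i \le |\alpha|$¹, as

\[ \alpha|_i \triangleq s_0 a_1 s_1 \cdots a_i s_i \]

\[ {}_i|\alpha \triangleq \begin{cases} s_i a_{i+1} s_{i+1} \cdots & \text{if } i < |\alpha| \\ s_{|\alpha|} & \text{if } \alpha \text{ is finite and } i = |\alpha| \end{cases} \]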

The trace of an execution fragment $\alpha$ of $A$, written $\mathit{trace}_A(\alpha)$, or just $\mathit{trace}(\alpha)$ when $A$ is clear,
is the list obtained by restricting $\alpha$ to the set of external actions of $A$, i.e., $\mathit{trace}(\alpha) = \alpha \upharpoonright \mathit{ext}(A)$.
For a set $L$ of executions of $A$, denote by $\mathit{traces}_A(L)$, or just $\mathit{traces}(L)$ when $A$ is clear from
context, the set of traces of the executions in $L$. We say that $\beta$ is a trace of $A$ if there exists an
execution $\alpha$ of $A$ with $\mathit{trace}(\alpha) = \beta$. Denote by $\mathit{traces}^*(A)$, $\mathit{traces}^\omega(A)$ and $\mathit{traces}(A)$ the sets of
finite, infinite and all traces of $A$, respectively. Note that a finite trace might be the trace of an
infinite execution. Furthermore, for any list $l$ of actions of $A$, define $\mathit{trace}_A(l)$, or just $\mathit{trace}(l)$
when $A$ is clear from context, to be $l \upharpoonright \mathit{ext}(A)$.
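
For finite execution fragments, the length, prefix, suffix and trace operations above reduce to simple list manipulation. The sketch below assumes that a fragment is stored as a Python list that alternates states and actions, starting and ending with a state; the function names and the sample fragment (including the internal action `"tick"`) are hypothetical.

```python
def length(fragment):
    """Number of actions in a finite fragment [s0, a1, s1, ..., an, sn]."""
    return len(fragment) // 2


def prefix(fragment, i):
    """The i-th prefix: s0 a1 s1 ... ai si."""
    return fragment[: 2 * i + 1]


def suffix(fragment, i):
    """The i-th suffix: si a(i+1) s(i+1) ..."""
    return fragment[2 * i:]


def trace(fragment, external):
    """Restrict the actions of the fragment to the external ones."""
    actions = fragment[1::2]  # odd positions hold the actions
    return [a for a in actions if a in external]


alpha = ["empty", "send", "full", "tick", "full", "deliver", "empty"]
ext = {"send", "deliver"}  # "tick" plays the role of an internal action

alpha_trace = trace(alpha, ext)
```

Here `alpha_trace` keeps only `send` and `deliver`, so two fragments that differ only in their internal steps have the same trace.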


When specifying complex distributed systems, it is important to be able to specify each process 
separately and then obtain the specification of the entire system as the parallel composition of 
the specifications of the processes. This modular approach greatly reduces the complexity of 
specifying large systems. The parallel composition operator in this model uses a synchronization 
style where automata synchronize on their common actions and evolve independently on the 
others. It is required that each external action be under the control of at most one automaton, 


¹The index $i$ ranges over the natural numbers, so if $|\alpha| = \infty$, then $i \le |\alpha|$ is the same as $i < |\alpha|$.
thus, parallel composition is defined only for compatible safe I/O automata. Compatibility
requires that each action be an output action of at most one safe I/O automaton. Furthermore, 
to avoid action name clashes, compatibility requires that internal action names be unique. 


Definition 2.2 (Parallel composition of safe I/O automata) 


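Safe I/O automata $A_1, \ldots, A_N$ are compatible if for all $1 \le i, j \le N$ with $i \ne j$

1. $\mathit{out}(A_i) \cap \mathit{out}(A_j) = \emptyset$
2. $\mathit{int}(A_i) \cap \mathit{acts}(A_j) = \emptyset$

The parallel composition $A_1 \| \cdots \| A_N$ of compatible safe I/O automata $A_1, \ldots, A_N$ is the safe
I/O automaton $A$ such that

1. $\mathit{states}(A) = \mathit{states}(A_1) \times \cdots \times \mathit{states}(A_N)$
2. $\mathit{start}(A) = \mathit{start}(A_1) \times \cdots \times \mathit{start}(A_N)$
3. $\mathit{out}(A) = \mathit{out}(A_1) \cup \cdots \cup \mathit{out}(A_N)$
4. $\mathit{in}(A) = (\mathit{in}(A_1) \cup \cdots \cup \mathit{in}(A_N)) \setminus \mathit{out}(A)$
5. $\mathit{int}(A) = \mathit{int}(A_1) \cup \cdots \cup \mathit{int}(A_N)$
6. $((s_1, \ldots, s_N), a, (s_1', \ldots, s_N')) \in \mathit{steps}(A)$ iff for all $1 \le i \le N$

(a) if $a \in \mathit{acts}(A_i)$ then $(s_i, a, s_i') \in \mathit{steps}(A_i)$
(b) if $a \notin \mathit{acts}(A_i)$ then $s_i = s_i'$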

The executions of the parallel composition of compatible safe I/O automata $A = A_1 \| \cdots \| A_N$
can be projected to the component automata. First, for any state $s$ of $A$, denote by $s \lceil A_i$ the
state of $A_i$ obtained by projecting $s$ to $A_i$. Then, for any execution $\alpha$ of $A$ denote by $\alpha \lceil A_i$ the
execution of $A_i$ obtained from $\alpha$ by projecting the states in $\alpha$ to $A_i$ and by removing each action
not in $\mathit{acts}(A_i)$ together with the state preceding the action.
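
The compatibility conditions and the step rule of Definition 2.2 can be tried out on small finite automata. The sketch below is one possible encoding, assuming each automaton is a dictionary with the keys `"states"`, `"start"`, `"in"`, `"out"`, `"int"` and `"steps"`; the `sender` and `buffer` components are hypothetical examples.

```python
from itertools import product


def acts(A):
    return A["in"] | A["out"] | A["int"]


def compatible(automata):
    """Outputs are pairwise disjoint and internal actions are private."""
    for i, A in enumerate(automata):
        for j, B in enumerate(automata):
            if i != j and ((A["out"] & B["out"]) or (A["int"] & acts(B))):
                return False
    return True


def compose(automata):
    if not compatible(automata):
        raise ValueError("automata are not compatible")
    out = frozenset().union(*(A["out"] for A in automata))
    inp = frozenset().union(*(A["in"] for A in automata)) - out
    internal = frozenset().union(*(A["int"] for A in automata))
    states = set(product(*(A["states"] for A in automata)))
    steps = set()
    for s in states:
        for a in inp | out | internal:
            choices = []
            for A, si in zip(automata, s):
                if a in acts(A):   # clause (a): the component takes an a-step
                    choices.append([t for (p, b, t) in A["steps"] if p == si and b == a])
                else:              # clause (b): the component stays where it is
                    choices.append([si])
            steps.update((s, a, t) for t in product(*choices))
    return {"states": states, "start": set(product(*(A["start"] for A in automata))),
            "in": inp, "out": out, "int": internal, "steps": steps}


sender = {"states": {"idle", "sent"}, "start": {"idle"}, "in": set(),
          "out": {"send"}, "int": set(), "steps": {("idle", "send", "sent")}}
buffer = {"states": {"empty", "full"}, "start": {"empty"}, "in": {"send"},
          "out": {"deliver"}, "int": set(),
          "steps": {("empty", "send", "full"), ("full", "send", "full"),
                    ("full", "deliver", "empty")}}

system = compose([sender, buffer])
```

In `system` the action `send` is an output (it is controlled by `sender`), and both components move together on it, while `deliver` changes only the buffer component of the state.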


Parallel composition is typically used to build complex systems based on simpler components. 
Some actions are meant to represent internal communications between the subcomponents of 
the complex system. The action hiding operator allows us to change some external actions into 
internal ones. 


Definition 2.3 (Action hiding) 


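Let $A$ be a safe I/O automaton and let $\Lambda$ be a set of actions such that $\Lambda \subseteq \mathit{local}(A)$. Then
define $A \setminus \Lambda$ to be the safe I/O automaton such that

1. $\mathit{states}(A \setminus \Lambda) = \mathit{states}(A)$
2. $\mathit{start}(A \setminus \Lambda) = \mathit{start}(A)$
3. $\mathit{in}(A \setminus \Lambda) = \mathit{in}(A)$
4. $\mathit{out}(A \setminus \Lambda) = \mathit{out}(A) \setminus \Lambda$
5. $\mathit{int}(A \setminus \Lambda) = \mathit{int}(A) \cup \Lambda$
6. $\mathit{steps}(A \setminus \Lambda) = \mathit{steps}(A)$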
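
Action hiding only reclassifies actions; states and steps are untouched. A minimal sketch, assuming the same dictionary encoding of finite automata as a convenience (keys `"in"`, `"out"`, `"int"`, `"steps"`, and so on); the `channel` automaton and the function name `hide` are hypothetical.

```python
def hide(A, hidden):
    """Return A with the actions in `hidden` moved from outputs to internals."""
    local = A["out"] | A["int"]
    if not hidden <= local:
        raise ValueError("only locally-controlled actions can be hidden")
    return {**A, "out": A["out"] - hidden, "int": A["int"] | hidden}


channel = {
    "states": {0, 1},
    "start": {0},
    "in": {"send"},
    "out": {"deliver", "ack"},
    "int": set(),
    "steps": {(0, "send", 1), (1, "send", 1), (1, "deliver", 0), (0, "ack", 0)},
}

quiet_channel = hide(channel, {"ack"})
```

After hiding, `ack` no longer appears in any trace of `quiet_channel`, although every step labeled `ack` is still present.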


The final operator on safe I/O automata is action renaming. Several processes might be identical
except for their actions’ names. A classical example is given by the processes of a token ring 
communication network. Such processes could be easily specified by first defining a generic 
process and then creating an instance for each process through renaming of the actions. Action 
renaming can also be used to resolve name clashes that lead to incompatibilities in Definition 2.2. 


Definition 2.4 (Action renaming) 


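A mapping $\rho$ from actions to actions is applicable to a safe I/O automaton $A$ if it is injective
and $\mathit{acts}(A) \subseteq \mathit{dom}(\rho)$. Given a safe I/O automaton $A$ and a mapping $\rho$ applicable to $A$, we define
$\rho(A)$ to be the safe I/O automaton such that

1. $\mathit{states}(\rho(A)) = \mathit{states}(A)$
2. $\mathit{start}(\rho(A)) = \mathit{start}(A)$
3. $\mathit{in}(\rho(A)) = \rho(\mathit{in}(A))$
4. $\mathit{out}(\rho(A)) = \rho(\mathit{out}(A))$
5. $\mathit{int}(\rho(A)) = \rho(\mathit{int}(A))$
6. $\mathit{steps}(\rho(A)) = \{(s, \rho(a), s') \mid (s, a, s') \in \mathit{steps}(A)\}$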
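
The token ring example mentioned above can be made concrete. The sketch below assumes that renaming is applied elementwise to the input, output and internal action sets as well as to the steps, and that automata are stored as dictionaries; the generic process and the action names `token_0_1` and `token_1_2` are hypothetical.

```python
def rename(A, rho):
    """Apply an injective action renaming rho (a dict) to the automaton A."""
    acts = A["in"] | A["out"] | A["int"]
    if not acts <= set(rho):
        raise ValueError("rho must be defined on every action of A")
    if len(set(rho.values())) != len(rho):
        raise ValueError("rho must be injective")

    def image(actions):
        return {rho[a] for a in actions}

    return {
        "states": A["states"],
        "start": A["start"],
        "in": image(A["in"]),
        "out": image(A["out"]),
        "int": image(A["int"]),
        "steps": {(s, rho[a], t) for (s, a, t) in A["steps"]},
    }


# Hypothetical generic token-ring process: receive the token, then pass it on.
generic = {
    "states": {"waiting", "holding"},
    "start": {"waiting"},
    "in": {"receive"},
    "out": {"pass"},
    "int": set(),
    "steps": {
        ("waiting", "receive", "holding"),
        ("holding", "receive", "holding"),
        ("holding", "pass", "waiting"),
    },
}

# Instance for process 1: it receives from process 0 and passes to process 2.
process_1 = rename(generic, {"receive": "token_0_1", "pass": "token_1_2"})
```

Renaming the output of one instance to the input name of its neighbor is what lets the instances synchronize when they are composed in parallel.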


2.1.2 Live I/O Automata 


We have now described the safety component of a live I/O automaton. The liveness condition 
should specify which executions of a safe I/O automaton are considered to represent a properly 
working system. For this reason a liveness condition, in this model, is a subset of the executions of 
the safe I/O automaton. However, a liveness condition is used to restrict the long-term behavior 
of a system, i.e., to specify what must happen sooner or later. Thus, any finite execution of 
the safe I/O automaton should have an extension in the liveness condition. In other words, no 
matter what the safe I/O automaton has done up to some time, there is still a way for it to 
behave properly according to the liveness condition. 

This definition of a liveness condition only ensures that the liveness condition does not 
introduce more safety than is already specified by the safe I/O automaton. It does not, however, 
capture the fact that a live I/O automaton must not constrain its environment. To express this 
idea (the environment-freedom condition) formally, we set up a game between the system and 
its environment, and the system is then environment-free if it can win the game no matter what 
moves the environment performs, i.e., if the system has a winning strategy. The environment 
moves by providing any finite number of input actions, and the system moves by performing a 
local step, i.e., a step labeled by a locally-controlled action, or by making no step (a $\bot$ move).
The fact that the environment is allowed to provide any finite number of input actions at any
move expresses that the environment can be arbitrarily but not infinitely fast compared to the 
system. Note also that the environment provides actions and not steps. This is because the 
environment has no control over the state of the system: the environment provides the action 
and the system decides which of the possible states it should reach in response. 

The behavior of the system during the game is determined by a strategy. A strategy is a 
pair (g, f) of functions, where g determines which state to reach in response to an input action, 
and f determines the moves of the system. The notion of strategy is formalized as follows. 


Definition 2.5 (Strategy) 
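Consider any safe I/O automaton $A$. A strategy defined on $A$ is a pair of functions $(g, f)$ where
$g : \mathit{exec}^*(A) \times \mathit{in}(A) \to \mathit{states}(A)$ and $f : \mathit{exec}^*(A) \to (\mathit{local}(A) \times \mathit{states}(A)) \cup \{\bot\}$ such that

1. $g(\alpha, a) = s$ implies $(\mathit{lstate}(\alpha), a, s) \in \mathit{steps}(A)$

2. $f(\alpha) = (a, s)$ implies $(\mathit{lstate}(\alpha), a, s) \in \mathit{steps}(A)$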


The moves of the environment during the game are represented as an infinite sequence $\mathcal{I}$, called
an environment sequence, of input actions interleaved with infinitely many $\lambda$ symbols. The
symbol $\lambda$ is used to represent the points at which the system is allowed to move. The occurrence
of infinitely many $\lambda$ symbols in an environment sequence guarantees that each environment move
consists of only finitely many input actions.

Remember from the discussion above that after any finite execution the system should still 
have a way of behaving properly. This is reflected in the following definition of the outcome of 
a strategy. 


Definition 2.6 (Outcome of a strategy) 


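Let $A$ be a safe I/O automaton and $(g, f)$ a strategy defined on $A$. Define an environment
sequence for $A$ to be any infinite sequence of symbols from $\mathit{in}(A) \cup \{\lambda\}$ with infinitely many
occurrences of $\lambda$. Then define $R_{(g,f)}$, the next-function induced by $(g, f)$, as follows: for any
finite execution $\alpha$ of $A$ and any environment sequence $\mathcal{I}$ for $A$,

\[ R_{(g,f)}(\alpha, \mathcal{I}) = \begin{cases} (\alpha a s, \mathcal{I}') & \text{if } \mathcal{I} = \lambda \mathcal{I}',\ f(\alpha) = (a, s) \\ (\alpha, \mathcal{I}') & \text{if } \mathcal{I} = \lambda \mathcal{I}',\ f(\alpha) = \bot \\ (\alpha a s, \mathcal{I}') & \text{if } \mathcal{I} = a \mathcal{I}',\ g(\alpha, a) = s \end{cases} \]

Let $\alpha$ be any finite execution of $A$ and $\mathcal{I}$ any environment sequence for $A$. The outcome sequence
of $(g, f)$ given $\alpha$ and $\mathcal{I}$ is the unique infinite sequence $(\alpha^n, \mathcal{I}^n)_{n \ge 0}$ that satisfies:

• $(\alpha^0, \mathcal{I}^0) = (\alpha, \mathcal{I})$ and

• For all $n > 0$, $(\alpha^n, \mathcal{I}^n) = R_{(g,f)}(\alpha^{n-1}, \mathcal{I}^{n-1})$.

Note that $(\alpha^n)_{n \ge 0}$ forms a chain ordered by prefix.

The outcome $O_{(g,f)}(\alpha, \mathcal{I})$ of the strategy $(g, f)$ given $\alpha$ and $\mathcal{I}$ is the execution $\lim_{n \to \infty} \alpha^n$, where
$(\alpha^n, \mathcal{I}^n)_{n \ge 0}$ is the outcome sequence of $(g, f)$ given $\alpha$ and $\mathcal{I}$ and the limit is taken under prefix
ordering.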
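
The game can be simulated for a finite prefix of an environment sequence, which yields a finite prefix of the outcome. The sketch below is an illustration under stated assumptions: executions are Python lists alternating states and actions, the string `"λ"` stands for the symbol at which the system may move, `None` stands for the "no step" move, and the strategy shown is a hypothetical one for a one-place buffer with input `send` and output `deliver`.

```python
LAMBDA = "λ"    # marks a point where the system is allowed to move
BOTTOM = None   # the system's "no step" move


def next_step(g, f, alpha, env):
    """One application of the next-function to (alpha, env); env must be non-empty."""
    head, rest = env[0], env[1:]
    if head == LAMBDA:
        move = f(alpha)
        if move is BOTTOM:
            return alpha, rest
        a, s = move
        return alpha + [a, s], rest
    s = g(alpha, head)            # the environment supplies only the input action
    return alpha + [head, s], rest


def outcome_prefix(g, f, alpha, env_prefix):
    """Apply the next-function along a finite prefix of an environment sequence."""
    for _ in range(len(env_prefix)):
        alpha, env_prefix = next_step(g, f, alpha, env_prefix)
    return alpha


# Hypothetical strategy for a one-place buffer.
def g(alpha, a):
    return "full"                 # any "send" leaves the buffer full


def f(alpha):
    return ("deliver", "empty") if alpha[-1] == "full" else BOTTOM


result = outcome_prefix(g, f, ["empty"],
                        ["send", LAMBDA, LAMBDA, "send", "send", LAMBDA])
```

With this strategy the buffer delivers whenever it is full and given a chance to move, and it does nothing at the second `λ` because it is empty at that point.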


It is easy to see that any outcome of a strategy is an execution of the safe I/O automaton.
The concepts of strategies and outcomes are used to define formally the environment-freedom
property.


Definition 2.7 (Environment-freedom) 


A pair $(A, L)$, where $A$ is a safe I/O automaton and $L \subseteq \mathit{exec}(A)$, is environment-free if there
exists a strategy $(g, f)$ defined on $A$ such that for any finite execution $\alpha$ of $A$ and any environment
sequence $\mathcal{I}$ for $A$, the outcome $O_{(g,f)}(\alpha, \mathcal{I})$ is an element of $L$. The strategy $(g, f)$ is called an
environment-free strategy for $(A, L)$.


Clearly, if a pair (A, L) is environment-free, then any finite execution of A has an extension in
L. Finally we can present the notion of live I/O automaton.


Definition 2.8 (Live I/O automata) 


A live I/O automaton is a pair $(A, L)$ where $A$ is a safe I/O automaton and $L \subseteq \mathit{exec}(A)$ such
that $(A, L)$ is environment-free. We refer to the executions in $L$ as the live executions of $(A, L)$.
Similarly the traces in $\mathit{traces}(L)$ are referred to as the live traces of $(A, L)$.


In Chapter 4 we will define some standard liveness conditions, like weak fairness, for safe I/O 
automata and show once and for all that the resulting pairs are environment-free. 


The operators on safe I/O automata can now be extended to live I/O automata. For parallel 
composition the liveness condition for a composed system consists of all those executions whose
projections to the components yield live executions of the components. That corresponds to the
intuitive idea that a composed system works properly if all components work properly. 


Definition 2.9 (Parallel composition of live I/O automata) 


Live I/O automata $(A_1, L_1), \ldots, (A_N, L_N)$ are compatible if the safe I/O automata $A_1, \ldots, A_N$
are compatible.

The parallel composition $(A_1, L_1) \| \cdots \| (A_N, L_N)$ of compatible live I/O automata $(A_1, L_1),
\ldots, (A_N, L_N)$ is defined to be the pair $(A, L)$ where $A = A_1 \| \cdots \| A_N$ and $L = \{\alpha \in \mathit{exec}(A) \mid
\alpha \lceil A_1 \in L_1, \ldots, \alpha \lceil A_N \in L_N\}$.


Definition 2.10 (Action hiding of live I/O automata) 
Let $(A, L)$ be a live I/O automaton and let $\Lambda$ be a set of actions such that $\Lambda \subseteq \mathit{local}(A)$. Then
define $(A, L) \setminus \Lambda$ to be the pair $(A \setminus \Lambda, L)$.


Definition 2.11 (Action renaming of live I/O automata) 


A mapping $\rho$ from actions to actions is applicable to a live I/O automaton $(A, L)$ if it is applicable
to $A$. Let $\alpha$ be any execution of $A$. Define $\rho(\alpha)$ to be the sequence that results from replacing
each occurrence of every action $a$ in $\alpha$ by $\rho(a)$. Given a live I/O automaton $(A, L)$ and a mapping
$\rho$ applicable to $(A, L)$, we define $\rho((A, L))$ to be the pair $(\rho(A), \rho(L))$.²


An important property of the operators is that they are closed for live I/O automata in the 
sense that they produce new live I/O automata. 


Proposition 2.12 (Closure of parallel composition) 


Let $(A_1, L_1), \ldots, (A_N, L_N)$ be compatible live I/O automata. Then $(A_1, L_1) \| \cdots \| (A_N, L_N)$ is
a live I/O automaton.


Proposition 2.13 (Closure of action hiding) 
Let $(A, L)$ be a live I/O automaton and let $\Lambda \subseteq \mathit{local}(A)$. Then $(A, L) \setminus \Lambda$ is a live I/O
automaton.


Proposition 2.14 (Closure of action renaming) 


Let $(A, L)$ be a live I/O automaton and let $\rho$ be a mapping applicable to $(A, L)$. Then $\rho((A, L))$
is a live I/O automaton.


2.1.3 Correctness 


The notion of correct implementation between live I/O automata is based on their live traces.
A live I/O automaton (A, L) is said to correctly implement a live I/O automaton (B, M), with
the same input and output actions, if all live traces of (A, L) are also live traces of (B, M).
This correctness notion ensures that whatever (A, L) does, (B, M) could have done the same.
That is, (A, L) does nothing wrong, which in other words means that (A, L) satisfies the safety
specified by (B, M). Furthermore, the correctness notion also guarantees that (A, L) in fact
does something because the correctness notion is based on live traces, i.e., traces where something
“good” happens.

Sometimes one is not interested in the liveness of a system and therefore specifies a system
as a safe I/O automaton. One safe I/O automaton A is said to safely implement a safe I/O
automaton B, with the same input and output actions, if all traces of A are also traces of B.
This notion of safe implementation does not guarantee that A does anything at all. In fact, a
safe I/O automaton A with one state, no local steps, and “self-loop” steps for each of its input
actions, is a safe implementation of any safe I/O automaton with the same input and output
actions. The notion of safe implementation trivially extends to live I/O automata.

² As a notational convention we allow a function to be applied to subsets of elements from the domain of the
function. The result is then the set obtained by applying the function to each element of the subset. Thus,
ρ(L) = {ρ(α) | α ∈ L}.


Definition 2.15 (Implementation relations) 


Given two live I/O automata (A, L) and (B, M) such that in(A) = in(B) and out(A) = out(B),
define the following implementation relations:


Safe: A ⊑_S B iff traces(A) ⊆ traces(B)
Safe: (A, L) ⊑_S (B, M) iff A ⊑_S B

Correct: (A, L) ⊑_L (B, M) iff traces(L) ⊆ traces(M)


The symbol ⊑_S indicates that this relation is based on Safe traces. Similarly ⊑_L is based on
Live traces. All implementation relations are clearly preorders.
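
As an informal illustration of safe implementation as trace inclusion, the following Python sketch enumerates the traces of small finite automata up to a bounded number of steps and compares them. The function names, the bound, and the two toy automata (a request/acknowledge specification and the trivial one-state automaton with an input self-loop described earlier in this section) are assumptions made for the example; bounded enumeration only approximates the full trace sets.

```python
def bounded_traces(start, steps, external, depth):
    """Traces (tuples of external actions) of all executions with at most `depth` steps."""
    traces = set()
    frontier = {(s, ()) for s in start}
    for _ in range(depth + 1):
        traces |= {trace for (_, trace) in frontier}
        # Extend every execution by one step; internal actions leave the trace unchanged.
        frontier = {
            (target, trace + ((action,) if action in external else ()))
            for (state, trace) in frontier
            for (source, action, target) in steps
            if source == state
        }
    return traces


def safely_implements(traces_a, traces_b):
    """Safe implementation on (bounded) trace sets: plain set inclusion."""
    return traces_a <= traces_b


external = {"req", "ack"}                                   # hypothetical action names
spec_steps = {(0, "req", 1), (1, "req", 1), (1, "ack", 0)}  # answers requests
trivial_steps = {(0, "req", 0)}                             # accepts inputs, never outputs

spec_traces = bounded_traces({0}, spec_steps, external, 3)
trivial_traces = bounded_traces({0}, trivial_steps, external, 3)
```

Here `safely_implements(trivial_traces, spec_traces)` holds while the converse does not, matching the remark that safe implementation alone does not force an implementation to do anything.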


2.1.4 Substitutivity 


An important property of the model is that it allows a modular approach to systems specification 
and verification. If, for instance, a system S is made up of several parallel components, it is
possible to implement separately each component of S and yet obtain an implementation of S.
This is usually referred to as the substitutivity of the implementation relations with respect to 
the parallel composition operator. Similar results exist for the other two operators as stated in 
the following proposition. 


Proposition 2.16 (Substitutivity) 

Let (A_i, L_i), (B_i, M_i), i = 1, ..., N, be live I/O automata with in(A_i) = in(B_i) and out(A_i) =
out(B_i), and let ⊑_X be one relation among ⊑_S and ⊑_L. If, for each i, (A_i, L_i) ⊑_X (B_i, M_i),
then

1. if (A_1, L_1), ..., (A_N, L_N) are compatible and (B_1, M_1), ..., (B_N, M_N) are compatible then
(A_1, L_1) || ··· || (A_N, L_N) ⊑_X (B_1, M_1) || ··· || (B_N, M_N).


2. if Λ ⊆ local(A_1) and Λ ⊆ local(B_1) then
(A_1, L_1) \ Λ ⊑_X (B_1, M_1) \ Λ


3. if ρ is a mapping applicable to both A_1 and B_1 then
ρ((A_1, L_1)) ⊑_X ρ((B_1, M_1))


Note, in Part 1 of the proposition, that even though (A_1, L_1), ..., (A_N, L_N) are compatible,
the specifications (B_1, M_1), ..., (B_N, M_N) might not be compatible if they contain internal actions
that collide with already existing actions of other components. Thus, we must require that
(B_1, M_1), ..., (B_N, M_N) also be compatible. However, in practice the problem is usually solved by
choosing brand new names for new internal actions in an implementation. Similar considerations
apply to Parts 2 and 3.
2.2 The Model for Timed Systems


The timed model, called live timed I/O automata, is very similar to the untimed model in that it
consists of an automaton part (a safe timed I/O automaton) and a liveness condition. Each state
of the safe timed I/O automaton has an associated time, returned by the mapping .now, and there is a
special time-passage action ν representing the passage of time. The steps of a safe timed I/O
automaton are restricted such that time-passage steps must increase time and all other steps
must not change time. Thus, all steps other than time-passage steps are thought of as occurring
instantaneously. There are a few other restrictions representing natural properties of time.


2.2.1 Safe Timed I/O Automata 


Times are specified using a dense time domain T = R^{≥0}, i.e., the set of non-negative reals.

Definition 2.17 (Safe timed I/O automata)

A safe timed I/O automaton A consists of five components:

• A set states(A) of states.

• A nonempty set start(A) of start states (start(A) ⊆ states(A)).


• A mapping .now_A : states(A) → T (called .now when A is clear from context), indicating
the current time in a given state.


• An action signature sig(A) = (in(A), out(A), int(A)) of disjoint sets of input, output, and
internal actions, respectively. Denote by ext(A) the set in(A) ∪ out(A) ∪ {ν} of external
actions, where ν is a special time-passage action, by vis(A) the set in(A) ∪ out(A) of visible
actions, by local(A) the set out(A) ∪ int(A) of locally-controlled actions, and by acts(A)
the set ext(A) ∪ int(A) of actions.


• A transition relation steps(A) ⊆ states(A) × acts(A) × states(A).

A must be input-enabled and satisfy the following five axioms:

S1 If s ∈ start(A) then s.now = 0.
S2 If (s, a, s') ∈ steps(A) and a ≠ ν, then s'.now = s.now.
S3 If (s, ν, s') ∈ steps(A) then s'.now > s.now.
S4 If (s, ν, s') ∈ steps(A) and (s', ν, s'') ∈ steps(A), then (s, ν, s'') ∈ steps(A).


To be able to state the last axiom, the following auxiliary definition is needed. Let I be an
interval of T. Then a function w : I → states(A) is an A-trajectory, sometimes called trajectory
when A is clear from context, if


1. w(t).now = t for all t ∈ I, and


2. (w(t), ν, w(t')) ∈ steps(A) for all t, t' ∈ I with t < t'.

That is, w assigns to each time t in the interval I a state having the given time t as its now
component. The assignment is done in such a way that time-passage steps can span between any
pair of states in the range of w. Denote inf(I) and sup(I) by ftime(w) and ltime(w), respectively.
If I is left closed, then denote w(ftime(w)) by fstate(w). Similarly, if I is right closed, then denote
w(ltime(w)) by lstate(w). If I is closed, then w is said to be an A-trajectory from fstate(w) to
lstate(w). An A-trajectory w whose domain dom(w) is a singleton set [t, t] is also denoted by
the set {w(t)}.

The final axiom then becomes

S5 If (s, ν, s') ∈ steps(A) then there exists an A-trajectory from s to s'.


Axiom S1 states that time must be 0 in any start state. Axiom S2 says that non-time-passage
steps occur instantaneously, at a single point in time. In this framework, operations with some
duration in time are modeled by a start action and an end action. Axiom S3 says that time-passage
steps cause time to increase. Axiom S4 gives a natural property of time, namely that if
time can pass in two steps, then it can also pass in a single step. Finally, Axiom S5 says that if
time can pass from time t to time t', then it is possible to associate states with all times in the
interval in a consistent way. This axiom opens the possibility of specifying hybrid systems, i.e.,
systems where the state can change continuously when time passes. However, in the systems we
will look at in this work the states consist of a “basic” state and a now variable, and the basic
state does not change during time-passage.
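
To make Axioms S1 to S4 concrete, the following Python sketch checks them on a finite set of steps. It is only an illustration under stated assumptions: states are modeled as hypothetical (basic, now) pairs, the string "nu" stands for the time-passage action ν, and the step set is a finite sample. Axiom S5 is not checked, since it quantifies over trajectories on dense time intervals.

```python
NU = "nu"  # stands for the time-passage action (naming is an assumption of this sketch)


def violated_axioms(start, steps):
    """Return the axioms among S1-S4 violated by a finite step set.

    States are (basic, now) pairs; steps are (state, action, state) triples.
    """
    violated = []
    if any(now != 0 for (_, now) in start):
        violated.append("S1")  # start states must have time 0
    if any(a != NU and s[1] != t[1] for (s, a, t) in steps):
        violated.append("S2")  # non-time-passage steps keep time fixed
    if any(a == NU and not t[1] > s[1] for (s, a, t) in steps):
        violated.append("S3")  # time-passage steps strictly increase time
    nu_steps = {(s, t) for (s, a, t) in steps if a == NU}
    if any((s, u) not in nu_steps
           for (s, t) in nu_steps for (t2, u) in nu_steps if t == t2):
        violated.append("S4")  # two time-passage steps compose into one
    return violated


start = {("idle", 0)}
good_steps = {
    (("idle", 0), NU, ("idle", 1)),
    (("idle", 1), NU, ("idle", 2)),
    (("idle", 0), NU, ("idle", 2)),
    (("idle", 1), "send", ("busy", 1)),
}
# Dropping the combined step from time 0 to time 2 breaks S4.
bad_steps = good_steps - {(("idle", 0), NU, ("idle", 2))}
```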


2.2.1.1 Timed Executions 


The notions of executions and traces and operations on these carry over from the untimed 
setting. However, executions do not adequately capture the behavior of a system since they do 
not tell us what states the system goes through during time-passage. For this reason a notion 
of timed executions is introduced. 


A timed execution fragment Σ of a safe timed I/O automaton A is a (finite or infinite) sequence
of alternating A-trajectories and actions in vis(A) ∪ int(A), starting in a trajectory and, if the
sequence is finite, ending in a trajectory


Σ = w_0 a_1 w_1 a_2 w_2 ···

such that the following holds for each index i:


1. If w_i is not the last trajectory in Σ, then its domain is a closed interval. If w_i is the last
trajectory of Σ (when Σ is a finite sequence), then its domain is a left-closed interval (and
either open or closed to the right).


2. If w_i is not the last trajectory of Σ, then (lstate(w_i), a_{i+1}, fstate(w_{i+1})) ∈ steps(A).


A timed execution is a timed execution fragment w_0 a_1 w_1 a_2 w_2 ··· for which fstate(w_0) is a start
state.

If Σ is a timed execution fragment, then define ftime(Σ) and fstate(Σ) to be ftime(w_0) and
fstate(w_0), respectively, where w_0 is the first trajectory of Σ. Also, define ltime(Σ) to be the
supremum of the union of the domains of the trajectories of Σ. Finally, if Σ is a finite sequence
where the domain of the last trajectory w is a closed interval, define lstate(Σ) to be lstate(w).
2.2.1.2 Finite, Admissible, and Zeno Timed Executions


The timed executions and timed execution fragments of a safe timed I/O automaton can be 
partitioned into finite, admissible, and Zeno timed executions and timed execution fragments. 

A timed execution (fragment) Σ is defined to be finite if it is a finite sequence and the domain
of the last trajectory is closed. A timed execution (fragment) Σ is admissible if ltime(Σ) = ∞.
Finally, a timed execution (fragment) Σ is Zeno if it is neither finite nor admissible.

There are basically two types of Zeno timed executions: those containing infinitely many
occurrences of non-time-passage actions but for which there is a finite upper bound on the times
in the domains of the trajectories, and those containing finitely many occurrences of
non-time-passage actions and for which the domain of the last trajectory is right-open. Thus, Zeno timed
executions represent executions of a safe timed I/O automaton where an infinite amount of
activity occurs in a bounded period of time. (For the second type of Zeno timed executions, the
infinitely many time-passage steps needed to span the right-open interval should be thought of
as the “infinite amount of activity”.)

There are idealized processes that naturally exhibit Zeno behaviors. As an example consider
a ball which is bouncing on the floor and is losing a fraction of its energy at each bounce. Ideally 
the ball will bounce infinitely many times within a finite amount of time. Note, however, that 
the safe timed I/O automaton model cannot suitably model this process since there is no way 
of specifying what happens after the ball stops bouncing. On the other hand, Zeno behaviors 
will not occur in the computer systems we usually want to specify. 
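
The bouncing-ball behavior can be illustrated numerically. The Python sketch below assumes, purely for the example, that each flight between bounces lasts a fixed fraction of the previous one (the first flight lasting 1 time unit and the ratio being 1/2); these values and names are not from the text. The bounce times then form the partial sums of a geometric series and stay below a finite limit, which is exactly the Zeno pattern of infinitely many actions in bounded time.

```python
from fractions import Fraction


def bounce_times(first_flight, ratio, bounces):
    """Times of the first `bounces` bounces when flight k lasts first_flight * ratio**k."""
    times, now = [], Fraction(0)
    for k in range(bounces):
        now += first_flight * ratio ** k
        times.append(now)
    return times


# Geometric series limit: first_flight / (1 - ratio) = 1 / (1 - 1/2) = 2.
limit = Fraction(1) / (1 - Fraction(1, 2))
times = bounce_times(Fraction(1), Fraction(1, 2), 20)
```

Every entry of `times` is strictly below `limit`, no matter how many bounces are generated.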

Below we will be mostly interested in the admissible timed executions since they correspond 
to our intuition that time is a force beyond our control that happens to approach infinity. 

Denote by t-frag*(A), t-frag^∞(A), t-frag^Z(A), and t-frag(A) the sets of finite, admissible,
Zeno, and all timed execution fragments of A. Similarly, denote by t-exec*(A), t-exec^∞(A),
t-exec^Z(A), and t-exec(A) the sets of finite, admissible, Zeno, and all timed executions of A.


A finite timed execution fragment Σ_1 = w_0 a_1 w_1 ··· a_n w_n of A and a timed execution fragment
Σ_2 = w'_n a_{n+1} w_{n+1} a_{n+2} w_{n+2} ··· of A can be concatenated if lstate(Σ_1) = fstate(Σ_2). The
concatenation, written Σ_1 ⌢ Σ_2, is defined to be Σ = w_0 a_1 w_1 ··· a_n (w_n ⌢ w'_n) a_{n+1} w_{n+1} a_{n+2} w_{n+2} ···,
where (w ⌢ w')(t) is defined to be w(t) if t is in dom(w), and w'(t) if t is in dom(w') \ dom(w). It is
easy to see that Σ is a timed execution fragment of A.

The notion of timed prefix, called t-prefix, for timed execution fragments is defined as follows.
A timed execution fragment Σ_1 of A is a t-prefix of a timed execution fragment Σ_2 of A, written
Σ_1 ≤_t Σ_2, if either Σ_1 = Σ_2 or Σ_1 is finite and there exists a timed execution fragment Σ'_1 of A
such that Σ_2 = Σ_1 ⌢ Σ'_1. Likewise, Σ_1 is a t-suffix of Σ_2 if there exists a finite timed execution
fragment Σ'_1 such that Σ_2 = Σ'_1 ⌢ Σ_1.

Define Σ ◁ t, read “Σ before t”, for all t ≥ ftime(Σ), to be the t-prefix of Σ that includes
exactly all states with times not bigger than t.

Likewise, define Σ ▷ t, read “Σ after t”, for all t < ltime(Σ) or all t ≤ ltime(Σ) when Σ is
finite, to be the t-suffix of Σ that includes exactly all states with times not smaller than t.


2.2.1.3 Timed Traces 


In the untimed setting automata are compared based on their traces. This turns out to be
inadequate in the timed setting because traces do not capture the invisible nature of time-passage
actions and furthermore do not contain information about the time of occurrence of the
visible actions. For this reason a notion of timed traces is introduced. We first define the notion
of timed sequence.


A timed sequence over a set K is defined to be a (finite or infinite) sequence δ over K × T in
which the second components (the time components) are nondecreasing. Define δ to be Zeno if
it is infinite and the limit of the time components is finite. For any nonempty timed sequence
δ, define ftime(δ) to be the time component of the first pair in δ.

Now, let Σ = w_0 a_1 w_1 a_2 w_2 ··· be a timed execution fragment of a safe timed I/O automaton
A. For each a_i, define the time of occurrence t_i to be ltime(w_{i-1}), or equivalently, ftime(w_i).
Then, define t-seq(Σ) to be the timed sequence consisting of the actions in Σ paired with their
time of occurrence:


t-seq(Σ) = (a_1, t_1)(a_2, t_2) ···

Then t-trace(Σ), the timed trace of Σ, is defined to be the pair

t-trace(Σ) = (t-seq(Σ) ↾ (vis(A) × T), ltime(Σ))


Thus, t-trace(Σ) records the occurrences of visible actions together with their time of occurrence,
and the limit time of the timed execution fragment. The timed trace suppresses both internal
and time-passage actions.
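
The construction of t-seq and t-trace can be sketched in Python as follows. The representation is an assumption made for the example: a timed execution fragment is a list alternating trajectories and actions, and each trajectory is reduced to the pair (ftime, ltime) of its domain endpoints, with the states themselves omitted. The action names are hypothetical.

```python
import math


def t_seq(fragment):
    """Pair each action with its time of occurrence, i.e. ltime of the preceding trajectory."""
    return [(fragment[i], fragment[i - 1][1]) for i in range(1, len(fragment), 2)]


def t_trace(fragment, visible):
    """Restrict t-seq to visible actions and pair it with the limit time of the fragment."""
    visible_seq = [(a, t) for (a, t) in t_seq(fragment) if a in visible]
    limit_time = max(trajectory[1] for trajectory in fragment[0::2])
    return visible_seq, limit_time


# Trajectories are (ftime, ltime) pairs; "tick" plays the role of an internal action.
fragment = [(0, 2), "send", (2, 2), "tick", (2, 5), "recv", (5, math.inf)]
trace = t_trace(fragment, visible={"send", "recv"})
```

In this example the internal action is dropped, the two visible actions keep their times 2 and 5, and the limit time is infinite, so the fragment is admissible.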

Let t-traces*(A), t-traces^∞(A), t-traces^Z(A), and t-traces(A) denote the sets of timed traces
of A obtained from finite, admissible, Zeno, and all timed executions of A, respectively.


2.2.1.4 Operations on Safe Timed I/O Automata 


As in the untimed setting, there are three operators defined on safe (timed) I/O automata. These 
are parallel composition, action hiding, and action renaming. The definitions are similar to the 
ones in the untimed setting except that special care has to be taken concerning the handling of 
time. For instance, in the parallel composition, all components must agree on real time. 


Definition 2.18 (Parallel composition) 

Safe timed I/O automata A_1, ..., A_N are compatible if for all 1 ≤ i, j ≤ N with i ≠ j

1. out(A_i) ∩ out(A_j) = ∅
2. int(A_i) ∩ acts(A_j) = ∅

The parallel composition A_1 || ··· || A_N of compatible safe timed I/O automata A_1, ..., A_N is the
safe timed I/O automaton A such that

1. states(A) = {(s_1, ..., s_N) ∈ states(A_1) × ··· × states(A_N) | s_1.now_{A_1} = ··· = s_N.now_{A_N}}
2. start(A) = start(A_1) × ··· × start(A_N)
3. (s_1, ..., s_N).now_A = s_1.now_{A_1} (= s_2.now_{A_2} = ··· = s_N.now_{A_N})
4. out(A) = out(A_1) ∪ ··· ∪ out(A_N)
5. in(A) = (in(A_1) ∪ ··· ∪ in(A_N)) \ out(A)
6. int(A) = int(A_1) ∪ ··· ∪ int(A_N)
7. ((s_1, ..., s_N), a, (s'_1, ..., s'_N)) ∈ steps(A) iff for all 1 ≤ i ≤ N
   (a) if a ∈ acts(A_i) then (s_i, a, s'_i) ∈ steps(A_i)
   (b) if a ∉ acts(A_i) then s_i = s'_i

Note how Condition 7 of the definition captures both time-passage steps (where all components
participate) and other steps (where a subset of the components participate).
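
Condition 7 can be illustrated with a small Python sketch. The encoding is an assumption of the example: each component is given as a pair (acts, steps) of finite sets, states are hypothetical (basic, now) pairs, and "nu" stands for the time-passage action, which belongs to the action set of every component.

```python
NU = "nu"  # stands for the time-passage action (naming is an assumption of this sketch)


def composed_step(components, s, a, s_next):
    """Condition 7: participants take an `a` step, all other components keep their state."""
    return all(
        ((s[i], a, s_next[i]) in steps) if a in acts else (s[i] == s_next[i])
        for i, (acts, steps) in enumerate(components)
    )


sender = (
    {NU, "send"},
    {(("ready", 0), "send", ("done", 0)),
     (("ready", 0), NU, ("ready", 1)),
     (("done", 0), NU, ("done", 1))},
)
clock = (
    {NU, "tick"},
    {(("c", 0), NU, ("c", 1)),
     (("c", 0), "tick", ("c", 0))},
)
components = [sender, clock]
initial = (("ready", 0), ("c", 0))
```

With these toy components, a "send" step changes only the sender, a time-passage step is allowed only when both components advance to the same time, and a step in which one component advances time while the other does not is rejected.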

Just like (ordinary) execution fragments can be projected to components in a composed
system, it is possible to define projection on timed execution fragments. If Σ = w_0 a_1 w_1 a_2 w_2 ···
is a timed execution fragment of a safe timed I/O automaton A = A_1 || ··· || A_N, define Σ⌈A_i to
be the timed execution fragment of A_i obtained by first projecting each state in the range of
each trajectory to A_i, and then, for each action a_j ∉ acts(A_i), removing a_j and merging the two
(projected) trajectories to the left and right of a_j. (Thus, if none of the actions belongs to A_i,
the result is one big trajectory representing time-passage of A_i.)

Action hiding and action renaming for safe timed I/O automata can also be defined.


Definition 2.19 (Action hiding) 


Let A be a safe timed I/O automaton and let Λ be a set of actions such that Λ ⊆ local(A).
Then define A \ Λ to be the safe timed I/O automaton such that


1. states(A \ Λ) = states(A)
2. start(A \ Λ) = start(A)
3. .now_{A \ Λ} = .now_A
4. in(A \ Λ) = in(A)
5. out(A \ Λ) = out(A) \ Λ
6. int(A \ Λ) = int(A) ∪ Λ
7. steps(A \ Λ) = steps(A)


Definition 2.20 (Action renaming) 


A mapping ρ from actions to actions is applicable to a safe timed I/O automaton A if it is
injective, acts(A) ⊆ dom(ρ), and ρ(ν) = ν. Given a safe timed I/O automaton A and a mapping
ρ applicable to A, define ρ(A) to be the safe timed I/O automaton with


1. states(ρ(A)) = states(A)
2. start(ρ(A)) = start(A)
3. .now_{ρ(A)} = .now_A
4. in(ρ(A)) = ρ(in(A))
5. out(ρ(A)) = ρ(out(A))
6. int(ρ(A)) = ρ(int(A))
7. steps(ρ(A)) = {(s, ρ(a), s') | (s, a, s') ∈ steps(A)}


2.2.2 Live Timed I/O Automata 


In the untimed setting a liveness condition for a safe I/O automaton A is a subset of the 
executions of A such that a special environment-freedom condition is satisfied. Similarly, in the 
timed setting a liveness condition for a safe timed I/O automaton is a set of timed executions 
such that a special timed version of the environment-freedom condition is satisfied. 

As in the untimed setting the environment-freedom condition is stated in terms of a game 
between the system and its environment. 

The notion of strategy is similar to the one used for the untimed case. However, the presence 
of time has a strong impact on the kind of interactions that can occur between an automaton 
and its environment. 

In the untimed case the environment is allowed to provide any finite number of input actions 
at each move, whereas the system is allowed to perform at most one of its locally-controlled 
steps at each move. In this way it is taken into account that the environment can be arbitrarily 
fast with respect to a system, however, not infinitely fast. In the timed case there is no need 
to assume the environment to be arbitrarily fast because each action occurs at a specific time. 
Therefore, the relative speeds of the system and the environment are given by their timing 
constraints. As a consequence the moves of the environment in the timed setting are input 
actions associated with their time of occurrence. Thus, the behavior of the environment during 
the game can be represented as a timed sequence over input actions. 

If a strategy in the timed setting decides to let time pass, it has to specify explicitly all 
intermediate states since the system must be able to respond to possible inputs during such 
a time-passage phase. Recall that in our model it is generally not possible to deduce
deterministically the states at intermediate times given a time-passage step.


Definition 2.21 (Strategy) 


Consider any safe timed I/O automaton A. A strategy defined on A is a pair of functions (g, f)
where g : t-exec*(A) × in(A) → states(A) and f : t-exec*(A) → (traj(A) × local(A) × states(A)) ∪
traj(A), where traj(A) denotes the set of A-trajectories, such that


1. g(Σ, a) = s implies Σ a {s} ∈ t-exec*(A)
2. f(Σ) = (w, a, s) implies Σ ⌢ w a {s} ∈ t-exec*(A)
3. f(Σ) = w implies Σ ⌢ w ∈ t-exec^∞(A)


4. f is consistent, i.e., if f(Σ) = (w, a, s), then, for each t, ftime(w) ≤ t ≤ ltime(w),
f(Σ ⌢ (w ◁ t)) = (w ▷ t, a, s), and, if f(Σ) = w, then, for each t, ftime(w) ≤ t < ltime(w),
f(Σ ⌢ (w ◁ t)) = w ▷ t.


For notational convenience define f(Σ).trj to be the trajectory chosen by f, that is,

f(Σ).trj = w if f(Σ) = (w, a, s), and
f(Σ).trj = w if f(Σ) = w.

A strategy is a pair of functions (g, f). Function f takes a finite timed execution and decides how
the system behaves until its next locally-controlled action under the assumption that no inputs are
received in the meantime, whereas function g decides what state to reach whenever some input
is received. Condition 1 states that g returns a “legal” next state given the input. Conditions
2 and 3 give two possibilities for the system moves given by f: either f specifies time-passage
followed by a local step, or f specifies that the system simply lets time pass forever. Note that
f specifies all states during time passage. This is because, as mentioned above and as we shall
see formally below, a move given by f might be interrupted by input actions, and in that case
it is necessary to know the current state when the inputs arrive. The consistency condition
(Condition 4) for f says that, whenever after a finite timed execution Σ the system decides to
behave according to w a {s} or w, after performing a part of w the system would decide to behave
according to the rest of the step w a {s} or w. The consistency condition is fundamental for the
substitutivity results below.

The game between the system and the environment works as follows. The environment can 
provide any input at any time, while the system lets time pass and provides locally-controlled 
actions according to its strategy. If an input arrives, the system will perform its current step 
till the time at which the input occurs, and then use function g to compute the state to reach 
after the input has occurred. 

In the timed setting the system might decide to perform a step at the same time at which 
the environment provides some input. Such situations are modeled as nondeterministic choices. 
As a consequence, the outcome, i.e., the result of the game, for a timed strategy is a set of timed 
executions. 


Definition 2.22 (Outcome of a strategy) 


Let A be a safe timed I/O automaton and (g, f) a strategy defined on A. Define a timed
environment sequence for A to be a timed sequence over in(A), and define a timed environment
sequence Z for A to be compatible with a timed execution fragment Σ of A if either Z is
empty, or Σ is finite and ltime(Σ) ≤ ftime(Z). Then define R_(g,f), the next-relation induced by
(g, f), as follows: for any Σ, Σ' ∈ t-exec(A) and any Z, Z' compatible with Σ, Σ', respectively,
((Σ, Z), (Σ', Z')) ∈ R_(g,f) iff (Σ', Z') equals one of the following:


1. (Σ ⌢ w a {s}, Z) where Σ is finite, Z = ε, f(Σ) = (w, a, s),

2. (Σ ⌢ w, Z) where Σ is finite, Z = ε, f(Σ) = w,

3. (Σ ⌢ w a {s}, Z) where Σ is finite, Z = (b, t)Z'', f(Σ) = (w, a, s), ltime(w) ≤ t,

4. (Σ ⌢ w' a {s'}, Z'') where Σ is finite, Z = (a, t)Z'', f(Σ).trj = w,
   ltime(w) ≥ t, w' = w ◁ t, g(Σ ⌢ w', a) = s', or

5. (Σ, Z) where Σ is not finite.


Let Σ be a finite timed execution of A, and Z be a timed environment sequence for A compatible
with Σ.


An outcome sequence of (g, f) given Σ and Z is an infinite sequence (Σ^n, Z^n)_{n≥0} that satisfies:


• (Σ^0, Z^0) = (Σ, Z) and
• for all n > 0, ((Σ^{n-1}, Z^{n-1}), (Σ^n, Z^n)) ∈ R_(g,f).


Note that (Σ^n)_{n≥0} forms a chain ordered by t-prefix.


The outcome O_(g,f)(Σ, Z) of the strategy (g, f) given Σ and Z is the set of timed executions
Σ' for which there exists an outcome sequence (Σ^n, Z^n)_{n≥0} of (g, f) given Σ and Z such that
Σ' = lim_{n→∞} Σ^n.


In the definition of outcome of a strategy (g, f), the next-relation R_(g,f) determines allowable
moves based on incoming inputs or performance of locally-controlled actions. In this way the
outcome sequences of (g, f) given some Σ and Z are determined step by step.

In the definition of R_(g,f), the first, second, and third cases deal with different situations
where no input occurs during the system move chosen by f; the fourth case, instead, takes care
of new incoming inputs; finally, the fifth case of the above definition is needed for technical
reasons to generate a fixpoint in the outcome sequences since the second case generates an
admissible timed execution. Note that the third and fourth cases might both be applicable
whenever an input occurs exactly at the same time at which the system decides to perform a
locally-controlled action. This is the reason for which the outcome is a set of timed executions.
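
The case analysis of the next-relation for a finite timed execution can be sketched in Python. This is a deliberately reduced model, assumed only for illustration: the move chosen by f is represented by the time ltime(w) at which its trajectory ends (infinity when f lets time pass forever) and a flag saying whether a locally-controlled action follows, and the environment sequence is reduced to the list of its input times. The fifth case (non-finite executions) is outside this sketch.

```python
import math


def applicable_cases(move_ltime, has_local_action, env_times):
    """Cases (numbered 1 to 4 in order of appearance) of the next-relation that apply.

    move_ltime:       ltime(w) of the move chosen by f (math.inf if f lets time pass forever)
    has_local_action: True if f returns (w, a, s), False if f returns only w
    env_times:        nondecreasing times of the pending inputs
    """
    if not env_times:
        return [1] if has_local_action else [2]
    t = env_times[0]
    cases = []
    if has_local_action and move_ltime <= t:
        cases.append(3)  # the system move completes no later than the next input
    if move_ltime >= t:
        cases.append(4)  # the input arrives no later than the end of the system move
    return cases
```

For instance, `applicable_cases(3, True, [3])` returns both 3 and 4, reflecting the nondeterministic choice when an input and a locally-controlled action occur at the same time.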


Assume that the liveness condition for a safe timed I/O automaton could consist of Zeno timed
executions only. If another safe timed I/O automaton has a liveness condition consisting of
admissible timed executions, these two systems could never work properly when composed
in parallel, since the first system would keep time from passing beyond some bound, which could
never yield live timed executions of the second system. (Remember that all components in a
parallel composition have to agree on real time.)

In this model this problem is solved by restricting attention to admissible timed executions 
since these timed executions correspond to our intuition that time grows unboundedly. Thus, in 
a live timed I/O automaton a liveness condition is a nonempty subset of the admissible timed 
executions. 

However, a problem arises as illustrated by the following example, which is due to Lamport: 
Consider two almost identical safe timed I/O automata with the following characteristics. They 
both have one input action and one output action, and if they receive an input before 12 o’clock
they will issue an output after exactly half the time remaining between the receipt of the input and 12
o’clock. Otherwise no output will be issued. To break the symmetry, one of the safe timed
I/O automata will unconditionally issue an output some time before 12 o’clock. Both of these 
safe timed I/O automata have a nonempty set of admissible timed executions, so adopt these 
sets to be the liveness conditions of the safe timed I/O automata, respectively. Now, compose 
these systems in parallel by connecting the output of one system to the input of the other, 
and vice versa. Then the resulting system has no admissible timed executions but only Zeno 
timed executions where time is constrained from passing beyond 12 o’clock. Seen from any of 
the components the other component prevents time from passing, and none of the components 
will behave properly in the parallel composition. Thus, the parallel composition would not be 
an element of the model (since it has no admissible timed executions), which contradicts the 
requirement that the parallel composition operator be closed for live timed I/O automata. 
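
The timing pattern in the Lamport example can be traced with a short Python sketch. It assumes, for illustration only, that the unconditional first output happens at 11 o'clock and that outputs reach the other component's input with no delay; the function names are hypothetical. Each response halves the time remaining until 12 o'clock, so the output times increase forever without reaching 12.

```python
from fractions import Fraction

NOON = Fraction(12)


def response_time(input_time):
    """Output is issued after half the time remaining between the input and 12 o'clock."""
    return input_time + (NOON - input_time) / 2


def ping_pong(first_output, rounds):
    """Times of the outputs exchanged by the two components, starting from `first_output`."""
    times = [first_output]
    for _ in range(rounds):
        times.append(response_time(times[-1]))
    return times


times = ping_pong(Fraction(11), 10)
```

After n rounds the remaining time to 12 o'clock is 1/2^n of an hour, so infinitely many outputs fit before 12 o'clock and time never passes beyond it.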

The problem illustrated in the example arises because the two components collaborate on 
performing the Zeno timed executions. To solve the problem, systems that can collaborate in 
this fashion need to be excluded from the model. We do this by identifying a special class of 
Zeno timed executions, the Zeno-tolerant timed executions. A Zeno-tolerant timed execution is
a Zeno timed execution containing infinitely many input actions but only finitely many
locally-controlled actions. We denote by t-exec^Zt(A) the set of Zeno-tolerant timed executions of a safe
timed I/O automaton A.

The Zeno-tolerant timed executions represent Zeno behaviors that are exclusively due to a 
Zeno environment. Thus, there is no collaboration between system and environment. This gives 
rise to a notion of Zeno-tolerant strategy. 


Definition 2.23 (Zeno-tolerant strategy) 


A strategy (g, f) defined on a safe timed I/O automaton A is said to be Zeno-tolerant if, for
every finite timed execution Σ ∈ t-exec*(A) and every timed environment sequence Z for A
compatible with Σ, O_(g,f)(Σ, Z) ⊆ t-exec^∞(A) ∪ t-exec^Zt(A).


Thus, any Zeno timed execution in an outcome of a Zeno-tolerant strategy is Zeno-tolerant and 
thus represents a behavior that is Zeno only because of Zeno inputs from the environment. Note, 
that in the Lamport example above it is not possible to find a Zeno-tolerant strategy defined on 
any of the two components: if one component behaves in a Zeno fashion, the other component 
will collaborate, and the resulting outcome cannot contain Zeno-tolerant timed executions. 

We are now ready to present the timed definition of environment-freedom. 


Definition 2.24 (Environment-freedom) 


A pair (A, L), where A is a safe timed I/O automaton and L ⊆ t-exec(A), is environment-free iff
there exists a Zeno-tolerant strategy (g, f) defined on A such that for each finite timed execution
Σ of A and each timed environment sequence Z for A compatible with Σ, O_(g,f)(Σ, Z) ⊆ L. The
pair (g, f) is called an environment-free strategy for (A, L).


A pair (A, L) is environment-free if, after any finite timed execution and with any (Zeno or
non-Zeno) sequence of input actions, it can behave according to some admissible or Zeno-tolerant
timed execution in L.

This leads to the definition of live timed I/O automata, where the liveness condition con- 
tains only admissible timed executions, but where the strategy is allowed to yield Zeno-tolerant 
outcomes when given a Zeno timed environment sequence. 


Definition 2.25 (Live timed I/O automata) 


A live timed I/O automaton is a pair (A, L), where A is a safe timed I/O automaton and
L ⊆ t-exec^∞(A), such that the pair (A, L ∪ t-exec^Zt(A)) is environment-free.


2.2.2.1 Operations on Live Timed I/O Automata 


The parallel composition, action hiding, and action renaming operators defined for safe timed 
I/O automata are now extended to live timed I/O automata in a fashion similar to the way the 
operators were extended in the untimed setting. 
Definition 2.26 (Parallel composition of live timed I/O automata)


Live timed I/O automata (A_1, L_1), ..., (A_N, L_N) are compatible iff the safe timed I/O automata
A_1, ..., A_N are compatible.

The parallel composition (A_1, L_1) || ··· || (A_N, L_N) of compatible live timed I/O automata
(A_1, L_1), ..., (A_N, L_N) is defined to be the pair (A, L) where A = A_1 || ··· || A_N and
L = {Σ ∈ t-exec^∞(A) | Σ⌈A_1 ∈ L_1, ..., Σ⌈A_N ∈ L_N}.


Definition 2.27 (Action hiding of live timed I/O automata) 


Let (A, L) be a live timed I/O automaton and let Λ be a set of actions such that Λ ⊆ local(A).
Then define (A, L) \ Λ to be the pair (A \ Λ, L).


Definition 2.28 (Action renaming of live timed I/O automata) 


A mapping ρ from actions to actions is applicable to a live timed I/O automaton (A, L) if it
is applicable to A. Let Σ be a timed execution of (A, L). Define ρ(Σ) to be the sequence that
results from replacing each occurrence of every action a in Σ by ρ(a). Given a live timed I/O
automaton (A, L) and a mapping ρ applicable to (A, L), define ρ((A, L)) to be the pair (ρ(A), ρ(L)).


As expected, the three operators above are closed for live timed I/O automata in the sense that
they produce a new live timed I/O automaton. This is a consequence of the environment-freedom
property.


Lemma 2.29 (Closure of timed parallel composition) 


Let (A_1, L_1), ..., (A_N, L_N) be compatible live timed I/O automata. Then the parallel composition
(A_1, L_1) || ··· || (A_N, L_N) is a live timed I/O automaton.


Lemma 2.30 (Closure of action hiding) 


Let (A, L) be a live timed I/O automaton and let Λ ⊆ local(A). Then (A, L) \ Λ is a live timed
I/O automaton.


Lemma 2.31 (Closure of action renaming) 


Let (A, L) be a live timed I/O automaton and let ρ be a mapping applicable to (A, L). Then
ρ((A, L)) is a live timed I/O automaton.


2.2.3 Correctness 


In the timed setting the safe and correct implementation relations are based on timed traces. 
Definition 2.32 (Timed implementation relations)


Given two live timed I/O automata (A, L) and (B, M) such that in(A) = in(B) and out(A) =
out(B), define the following implementation relations:


Safe: A ⊑_S B iff t-traces(A) ⊆ t-traces(B)
Safe: (A, L) ⊑_S (B, M) iff A ⊑_S B

Correct: (A, L) ⊑_L (B, M) iff t-traces(L) ⊆ t-traces(M)


2.2.4 Substitutivity 


The timed model, like the untimed model, offers a modular approach to systems specification 
and verification as stated by the following substitutivity results. 


Proposition 2.33 (Substitutivity) 


Let (A_i, L_i), (B_i, M_i), i = 1, ..., N, be live timed I/O automata with in(A_i) = in(B_i) and
out(A_i) = out(B_i), and let ⊑_X be one relation among ⊑_S and ⊑_L. If, for each i, (A_i, L_i) ⊑_X
(B_i, M_i), then


1. if (A_1, L_1), ..., (A_N, L_N) are compatible and (B_1, M_1), ..., (B_N, M_N) are compatible then
(A_1, L_1) || ··· || (A_N, L_N) ⊑_X (B_1, M_1) || ··· || (B_N, M_N).


2. if Λ ⊆ local(A_1) and Λ ⊆ local(B_1) then
(A_1, L_1) \ Λ ⊑_X (B_1, M_1) \ Λ


3. if ρ is a mapping applicable to both A_1 and B_1 then
ρ((A_1, L_1)) ⊑_X ρ((B_1, M_1))


2.3 Embedding Results 


The untimed model is used to specify systems where the actual amount of time that passes 
between actions is considered unimportant. Many problems in distributed computing can be 
stated and solved using this model. However, it is not possible to state anything about, e.g., 
response times. It is implicitly assumed that the final implementation on a physical machine is 
“fast enough” for practical usage. 

An untimed system can be thought of as a timed system that allows arbitrary time-passage, 
as long as possible liveness restrictions are satisfied. This indicates that our timed model is, in 
some sense, more general than our untimed model, and that we could use the timed model for 
all purposes. However, the timed model is more complicated than the untimed model due to 
the time-passage action, the .now component, etc., and furthermore it does not seem natural to 
have to deal with time, when the problem to be solved does not mention time at all. 

Thus, it is preferable to work within the untimed model as much as possible and only switch to 
the timed model when it is needed. The work in this report shows how the untimed specification 
(of the at-most-once message delivery problem) is implemented by a system that assumes upper 
time bounds on certain process steps and channel delays. Figure 2.1 depicts such a stepwise
development.


[Figure 2.1: A stepwise development from an untimed specification to a timed implementation. Labels in the figure: Untimed, Timed, IMPL.]


The question is of course what it means to implement an untimed specification
by a timed implementation. Our approach is to convert the untimed levels to the timed model 
by applying an operator, called patient, that adds arbitrary time-passage steps as mentioned 
above. We then have an Embedding Theorem which states that if a concrete level implements an 
abstract level in the untimed model, then the patient version of the concrete level implements 
the patient version of the abstract level in the timed model, and vice versa. Thus, the first part 
of the stepwise development of Figure 2.1 can be carried out entirely in the simpler untimed 
model, and the last part in the timed model. In the intermediate development step which goes 
from untimed to timed, one must prove that the timed level implements the patient version of 
the untimed level. The Embedding Theorem can then be applied to show that the implementation
IMPL implements the patient version of the specification SPEC.
We start by defining a patient safe I/O automaton.


Definition 2.34 (Patient safe I/O automaton) 
Let $A$ be a safe I/O automaton where $\nu \notin acts(A)$. Then define $patient(A)$ to be the safe timed
I/O automaton with

• $states(patient(A)) = states(A) \times T$
  If $s = (s', t)$ is a state of $patient(A)$, we let $s.basic$ denote $s'$.
• $start(patient(A)) = start(A) \times \{0\}$
• $now_{patient(A)}(s, t) = t$
• $ext(patient(A)) = ext(A) \cup \{\nu\}$
• $in(patient(A)) = in(A)$
• $out(patient(A)) = out(A)$
• $int(patient(A)) = int(A)$
• $steps(patient(A))$ consists of the steps
  - $\{((s, t), a, (s', t)) \mid (s, a, s') \in steps(A)\}$
  - $\{((s, t), \nu, (s, t')) \mid t' > t\}$
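
As a rough illustration of Definition 2.34, the following Python sketch represents the step relation of patient(A) as a membership test rather than an explicit set, since there are infinitely many time-passage steps. The names `SafeAutomaton`, `NU`, `patient_start`, and `patient_is_step` are hypothetical and do not come from the report; Python floats stand in for the time domain T, and only the start states and steps of A are modelled.

```python
from dataclasses import dataclass

NU = "nu"  # assumed name for the time-passage action; must not be an action of A


@dataclass(frozen=True)
class SafeAutomaton:
    """A finite safe I/O automaton, reduced to what this sketch needs."""
    start: frozenset  # start states
    steps: frozenset  # triples (s, a, s2)


def patient_start(aut):
    """start(patient(A)) = start(A) x {0}."""
    return {(s, 0.0) for s in aut.start}


def patient_is_step(aut, src, action, dst):
    """Membership test for steps(patient(A))."""
    (s, t), (s2, t2) = src, dst
    if action == NU:
        # Time passage: the basic state is unchanged and time strictly increases.
        return s == s2 and t2 > t
    # Ordinary step of A: the time component is unchanged.
    return t == t2 and (s, action, s2) in aut.steps


# A two-state toggle used only as an example automaton.
toggle = SafeAutomaton(
    start=frozenset({"off"}),
    steps=frozenset({("off", "flip", "on"), ("on", "flip", "off")}),
)
```

The membership-test representation is a design choice of the sketch: it keeps the definition readable without enumerating the uncountable set of pairs t < t'.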


In order to state what it means to apply the patient operator to a live I/O automaton, we need 
the following auxiliary definition of what it means to untime a timed execution: Let $A$ be a safe
I/O automaton with $\nu \notin acts(A)$ and let $\Sigma = \omega_0 a_1 \omega_1 a_2 \omega_2 \cdots$ be a timed execution of $patient(A)$.
Then define

$$untime(\Sigma) = (fstate(\omega_0).basic)\, a_1\, (fstate(\omega_1).basic)\, a_2\, (fstate(\omega_2).basic) \cdots$$

Similarly, let $\gamma = ((a_1, t_1)(a_2, t_2) \cdots, t)$ be a timed trace of $patient(A)$. Then define

$$untime(\gamma) = a_1 a_2 \cdots$$
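
For timed traces the untime operation simply forgets the time stamps. A minimal sketch, assuming a finite timed trace is represented as a pair of a list of (action, time) pairs and a limit time (the function name `untime_trace` and this representation are assumptions made for illustration):

```python
def untime_trace(timed_trace):
    """Map ((a1, t1)(a2, t2)..., t) to the untimed sequence a1 a2 ..."""
    pairs, _limit_time = timed_trace
    return [action for action, _time in pairs]


# Example: two visible actions observed at times 0.5 and 1.25, limit time 3.0.
example_trace = ([("send", 0.5), ("recv", 1.25)], 3.0)
untimed = untime_trace(example_trace)
```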


The notion of a patient live I/O automaton can now be defined. For any live I/O automaton 
(A, L), the patient live I/O automaton of (A, L) should be the live timed I/O automaton whose 
safety part is patient(A) and whose liveness part consists of all those admissible executions that, 
when being made untimed, are live according to L. Thus, the liveness condition of the patient 
live I/O automaton allows time to pass arbitrarily, as long as the liveness prescribed by L is 
satisfied. 


Definition 2.35 (Patient live I/O automaton) 


Let $(A, L)$ be a live I/O automaton with $\nu \notin acts(A)$. Then, define $patient_A(L) = \{\Sigma \in
\mathit{t\text{-}exec}^{\infty}(patient(A)) \mid untime(\Sigma) \in L\}$ and define $patient(A, L)$, the patient live I/O automaton
of $(A, L)$, to be the pair $(patient(A), patient_A(L))$.


It can be proved that for any live I/O automaton (A, L), patient(A, L) is a live timed I/O
automaton.


Lemma 2.36
Let (A, L) be a live I/O automaton. Then patient(A, L) is a live timed I/O automaton.


We now state the Embedding Theorem, which says that the safe and correct implementation relations
for live I/O automata coincide with the safe and correct implementation relations for the patient
versions of the live I/O automata.


Theorem 2.37 (Embedding Theorem)
Let $(A, L)$ and $(B, M)$ be live I/O automata with $\nu \notin (acts(A) \cup acts(B))$. Then

1. $(A, L) \sqsubseteq_{S} (B, M)$ iff $patient(A, L) \sqsubseteq_{St} patient(B, M)$.


2. $(A, L) \sqsubseteq_{L} (B, M)$ iff $patient(A, L) \sqsubseteq_{Lt} patient(B, M)$.


Finally we state a result which is important when doing specification and verification in a
modular fashion. Namely, the patient operator commutes with the three operators on safe and
live (timed) I/O automata. First, let $\equiv_{St}$ and $\equiv_{Lt}$ denote the kernels of the preorders $\sqsubseteq_{St}$ and
$\sqsubseteq_{Lt}$, respectively.*


Proposition 2.38 
Let $(A, L)$ and $(A_1, L_1), \ldots, (A_N, L_N)$ be live I/O automata and let $\equiv_X$ be one of $\equiv_{St}$ and $\equiv_{Lt}$.


1. Let $(A_1, L_1), \ldots, (A_N, L_N)$ be compatible. Then,
$patient((A_1, L_1) \| \cdots \| (A_N, L_N)) \equiv_X patient(A_1, L_1) \| \cdots \| patient(A_N, L_N)$


2. Let $\Lambda \subseteq local(A)$. Then,
$patient((A, L) \setminus \Lambda) \equiv_X patient(A, L) \setminus \Lambda$


3. Let $\rho$ be an action mapping applicable to $A$ and let $\rho_\nu$ be $\rho \cup [\nu \mapsto \nu]$. Then,
$patient(\rho(A, L)) \equiv_X \rho_\nu(patient(A, L))$


This concludes the introduction to the basic models of untimed and timed systems that we will 
use in this work. 


*The kernel of a preorder $\sqsubseteq$ is defined to be the equivalence $\equiv$ defined by $x \equiv y \triangleq x \sqsubseteq y \wedge y \sqsubseteq x$.


Chapter 3 


A Temporal Logic with Step Formulas


Chapter 2 defined the models of distributed systems we use in this work. One component of the 
models is the liveness condition which is a set of (timed) executions. Since such sets may be 
infinite (and each execution in the set may be an infinite sequence), it is necessary to have some 
way of denoting them without explicitly having to write down any executions. For this purpose 
we shall use a temporal logic which will be able to express properties of (ordinary) executions of 
safe (timed) I/O automata. Exactly how this temporal logic is used to specify liveness conditions 
for timed and untimed systems will be one of the issues of Chapter 4. This chapter is devoted 
to defining the temporal logic. 


In [MP92], Manna and Pnueli develop a temporal logic and give several examples of its use. 
For two reasons we cannot use their temporal logic directly. First, Manna and Pnueli evaluate 
temporal formulas over sequences of states and not over sequences of alternating states and 
actions. Second, they only deal with infinite sequences of states whereas (even live) executions 
of our systems may be finite. In a section below we show, however, how our temporal logic is 
related to that of [MP92]. 

The first reason suggests that maybe Lamport’s Temporal Logic of Actions (TLA) [Lam91] 
could be used. However, TLA is still state based in the sense that the semantics of a TLA 
formula is a set of sequences of states. Actions are in TLA merely state changes. It is possible 
that by having special TLA variables ranging over action names we could use TLA. However, 
due to the inherent importance of actions in our approach, we chose to develop our own temporal 
logic dealing with actions in a more intuitive manner. 


The rest of this chapter is organized as follows: In order to be able to state and prove results in 
this and later chapters, we start by introducing notions of stuttering and stuttering-equivalence 
in Section 3.1. Sections 3.2-3.4 then introduce the basic building blocks of our temporal logic: 
first, in Section 3.2, we introduce the notion of state functions and the special notion of state 
predicates. Section 3.3 then describes the notion of state transition functions, which are state 
functions that are evaluated over pairs of states. Finally, in Section 3.4, we introduce the 
important notion of step formulas. A step formula is a boolean valued function which is evaluated 
over steps. Thus, step formulas can express properties of both the states and the action of a 
step. 

Sections 3.5 and 3.6 now introduce the formulas of our temporal logic, i.e., the temporal 
formulas, by first, in Section 3.5, giving some basic temporal operators and then, in Section 3.6,
defining some important derived operators. In Section 3.7 we see how temporal formulas can be 
seen as formulas over safe (timed) I/O automata, and Section 3.8 deals with satisfaction and 
validity as well as validity with respect to safe (timed) I/O automata or sets of executions. 

Sections 3.9 and 3.10 provide results, mainly about special stuttering-insensitive formulas, 
which will prove very important in the next chapter. 

Then, in Section 3.11, we compare our temporal logic with that of Manna and Pnueli [MP92].
Finally, in order for our temporal logic to be useful for proving correctness of the protocols in 
the second part of this report, Section 3.12 provides certain rules of the logic. We do not in this 
work attempt to develop a completely axiomatized temporal logic, but merely state the rules we 
have found useful. Further research should investigate a basic set of rules of our temporal logic. 


Even though, strictly speaking, executions are only defined with respect to specific automata, 
we will in this chapter use the term “execution” to denote any alternating sequence of states 
and actions. As usual we let $\alpha$ range over executions.


3.1 Stuttering 


For technical reasons which will become clear below, we introduce a notion of stuttering steps 
and stuttering-equivalence of executions. 

Denote by $\zeta$ a special stuttering action. We will assume that $\zeta$ cannot be used as an ordinary
action of any safe (timed) I/O automaton. Below we will let A denote an arbitrary set of actions
and, hence, it will always be the case that $\zeta \notin A$. A stuttering step is any triple of the form
$(s, \zeta, s)$, where $s$ is a state.

Since $\zeta$ can never be an action of a safe (timed) I/O automaton A, it can never occur in
any execution of A. However, we will allow stuttering steps to occur in the broader sense of
executions used in this chapter. As we shall see below, we will not be able in temporal formulas
to refer to the stuttering actions in executions, but it turns out to be important to be able to
evaluate temporal formulas over executions possibly containing stuttering.

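Define $\natural\alpha$ to be the execution obtained by replacing every maximal (finite or infinite) sequence
$s \zeta s \zeta s \cdots$ in $\alpha$ by the single state $s$. Thus, the $\natural$ operator removes all stuttering. Now, define
two executions $\alpha_1$ and $\alpha_2$ to be stuttering-equivalent, written $\alpha_1 \simeq \alpha_2$, if $\natural\alpha_1 = \natural\alpha_2$.

For any execution $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ define


$$\tilde{\alpha} = \begin{cases} \alpha & \text{if } \alpha \text{ is infinite} \\ s_0 a_1 s_1 a_2 s_2 \cdots a_n s_n \zeta s_n \zeta s_n \cdots & \text{if } \alpha \text{ is finite and ends in } s_n \end{cases}$$


Thus, if $\alpha$ is finite, $\tilde{\alpha}$ is the infinite execution obtained by concatenating infinite stuttering at
the end of $\alpha$. Clearly, $\alpha \simeq \tilde{\alpha}$.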
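
For finite executions, removing stuttering and testing stuttering-equivalence can be sketched as follows. This is only an illustration under stated assumptions: an execution is a Python list alternating states and actions, `ZETA` is a hypothetical name for the stuttering action, and infinite executions are not handled.

```python
ZETA = "zeta"  # assumed name for the stuttering action


def destutter(execution):
    """Remove all stuttering steps from a finite execution [s0, a1, s1, ...]."""
    result = [execution[0]]
    for i in range(1, len(execution), 2):
        action, state = execution[i], execution[i + 1]
        if action == ZETA:
            # A stuttering step (s, zeta, s) must leave the state unchanged.
            assert state == result[-1], "zeta may only occur in stuttering steps"
            continue
        result += [action, state]
    return result


def stuttering_equivalent(e1, e2):
    """Two executions are stuttering-equivalent if they agree once stuttering is removed."""
    return destutter(e1) == destutter(e2)
```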


3.2 States, State Functions, and State Predicates 


In Chapter 2 we defined the state space of a safe (timed) I/O automaton to be any set of 
individual states. We did not assume any structure of these states but merely assumed that 
states are names. In practical examples, especially those presented in this work, the state space 
will be described as a mapping from state variables to their values. Thus, a safe (timed) I/O 
automaton is assumed to contain a number of (typed) state variables, and the individual states 
are then distinguished by having different assignments of values to these state variables. For this 
reason the temporal logic defined below will reference states using variable names. This approach
is also used in [MP92, Lam91]. Below we will let V denote a set of variables. Furthermore, in
order to avoid the complexity of carrying around the types of the variables, we assume that the
type of a variable is given implicitly by the name of the variable. For example, i, j and k will
typically range over the natural numbers.


We assume that we have a language for writing state functions—using variables, constants, 
standard operators, boolean connectives, and quantification—that can be evaluated over states. 
We will not give a language for writing down state functions since such languages are fairly 
standard. We refer to, e.g., [MP92] for a more thorough treatment of state functions. 

A state function over V is a state function whose free variables are a subset of V. If $f$ is
a state function over V, then clearly $f$ is also a state function over $V \cup V'$, where $V'$ is any set
of variables. For any state function $f$ over V and any V-state $s$ (i.e., any assignment of proper
values to all variables in V), we let $s[\![f]\!]$ denote the value of $f$ in state $s$.


A state predicate over V is a boolean valued state function over V. Below we shall see that state
predicates are a special case of a more general notion of step formula.


3.3 State Transition Functions


A state transition function $f$ over V is a state function over $V \cup V^\circ$, where $V^\circ$ is the set obtained
by tagging each variable in V with $\circ$. State transition functions over V are evaluated over pairs
$(s, s')$ of V-states. The variables in V refer to state variables in $s$ and variables in $V^\circ$ refer to
the corresponding state variables in $s'$. Formally, the value of a state transition function $f$ over
V in a pair $(s, s')$ of V-states, written $(s, s')[\![f]\!]$, is defined as


$$(s, s')[\![f]\!] = \bigl(s \cup [x^\circ \mapsto s'(x) \mid x \in V]\bigr)[\![f]\!]$$


Action Functions and State Transition Predicates 


An action function $f$ over (V, A) is a state transition function over V that yields a subset of
the actions in A when evaluated in any pair of V-states. Note that the stuttering action $\zeta$ can
never be in the range of an action function.


A state transition predicate P over V is any boolean valued state transition function over V.


3.4 Step Formulas 


A step formula over (V, A) is a formula that can be evaluated over triples $(s, a, s')$, where $s$ and
$s'$ are V-states and $a \in A \cup \{\zeta\}$, i.e., step formulas are evaluated over (possibly stuttering) steps.

There are two kinds of step formulas: those based on action functions and those based on
state transition predicates. We consider these two possibilities and in each case we define what
it means for a step formula P to hold in $(s, a, s')$, written $(s, a, s') \models P$.


If $f$ is an action function over (V, A), then $\langle f \rangle$ is a step formula over (V, A), and we define

$$(s, a, s') \models \langle f \rangle \quad\text{iff}\quad a \in (s, s')[\![f]\!]$$


Since $\zeta$ can never be in the range of $f$, the step formula $\langle f \rangle$ can never hold in a stuttering step.

A state transition predicate P over V is also a step formula over (V, A), where A is an arbitrary
set of actions, and we define


$$(s, a, s') \models P \quad\text{iff}\quad (s, s')[\![P]\!] = \mathit{true}$$
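
The two kinds of step formulas can be sketched in Python as follows. States are modelled as dictionaries from variable names to values, an action function as a Python function returning a set of actions, and a state transition predicate as a boolean function of the two states. All names (`action_formula`, `transition_formula`, the `buf` variable, the `send` action) are hypothetical and chosen only for illustration.

```python
ZETA = "zeta"  # assumed name for the stuttering action


def action_formula(f):
    """Step formula <f> built from an action function f(s, s2) -> set of actions."""
    return lambda s, a, s2: a in f(s, s2)


def transition_formula(pred):
    """Step formula given by a state transition predicate pred(s, s2) -> bool."""
    return lambda s, a, s2: bool(pred(s, s2))


# <f>: "send" is in the value of f whenever the buffer holds fewer than 2 items.
send_when_room = action_formula(lambda s, s2: {"send"} if s["buf"] < 2 else set())

# A state transition predicate comparing the pre-state and the post-state.
buf_grows = transition_formula(lambda s, s2: s2["buf"] == s["buf"] + 1)

# A state predicate: a transition predicate that ignores the post-state.
buf_small = transition_formula(lambda s, s2: s["buf"] <= 2)
```

Because an action function never yields the stuttering action, `send_when_room` is false on every stuttering step, whereas `buf_small` may well hold on one.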


3.4.1 State Predicates 


A state predicate P over V can now be seen as a special case of a step formula, namely a state
transition predicate over V that does not mention any variables in $V^\circ$. Thus, consistent with
the normal semantics of state predicates, we define what it means for a state predicate P over
V to hold in a V-state $s$, written $s \models P$,


$$s \models P \quad\text{iff}\quad (s, s)[\![P]\!] = \mathit{true}$$


When defining temporal formulas below, we deal with step formulas and thereby also state 
predicates. 


3.5 Temporal Formulas 


An execution $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ over (V, A) is an execution where each $s_i$ is a V-state and each
$a_i \in A \cup \{\zeta\}$ such that if $a_i = \zeta$ then $s_{i-1} = s_i$. (Thus, stuttering actions can only occur in
executions if they are part of stuttering steps.) Below we define the notion of temporal formulas
P over (V, A), and what it means for such a formula to hold at position $j \in \mathbb{N}$ in an execution $\alpha$
over (V, A), written $(\alpha, j) \models P$. (If $\alpha$ is finite, it is thought of as being extended with stuttering
such that we can also define what it means for P to hold at positions $j \geq |\alpha|$.)

A temporal formula over (V, A) contains only free variables in V and can only mention actions
in A. Thus, a temporal formula over (V, A) is also a temporal formula over $(V \cup V', A \cup A')$,
where $V'$ is any set of variables and $A'$ is any set of actions.


Let $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ below.


Step Formulas


Any step formula P over (V, A) is also a temporal formula over (V, A) and we have,


$$(\alpha, j) \models P \quad\text{iff}\quad \bigl(0 \leq j < |\alpha| \text{ and } (s_j, a_{j+1}, s_{j+1}) \models P\bigr) \text{ or } \bigl(j \geq |\alpha| \text{ and } (s_{|\alpha|}, \zeta, s_{|\alpha|}) \models P\bigr)$$


Thus, for all positions $j$ in $\alpha$ (except the last one if $\alpha$ is finite), P has to hold for the step
starting in state $s_j$. If $\alpha$ is finite and $j$ is greater than or equal to the last position in $\alpha$, P has
to hold for the step that stutters the last state.

The Next Operator 


If P is a temporal formula over (V, A), then so is $\bigcirc P$, read next P.


$$(\alpha, j) \models \bigcirc P \quad\text{iff}\quad (\alpha, j+1) \models P$$

The Unless (Waiting-for) Operator


If P and Q are temporal formulas over (V, A), then so is $P \,\mathcal{W}\, Q$, read P unless (or waiting-for)
Q.


$$(\alpha, j) \models P \,\mathcal{W}\, Q \quad\text{iff}\quad \begin{array}[t]{l} \text{either there exists a } k \geq j \text{ such that } (\alpha, k) \models Q \\ \quad \text{and for every } i \text{ with } j \leq i < k,\ (\alpha, i) \models P, \\ \text{or else for all } i \text{ with } i \geq j,\ (\alpha, i) \models P \end{array}$$


Quantification 


If P is a temporal formula over (V, A), then $(\forall x : P)$ and $(\exists x : P)$ are temporal formulas over
$(V \setminus \{x\}, A)$.

For any V-state $s$ denote by $s^x_v$, where $v$ is assumed to be in the type of the variable $x$, the
$(V \cup \{x\})$-state obtained from $s$ by either, if $x \in V$, changing the value of $x$ in $s$ to $v$, or, if
$x \notin V$, extending $s$ with a mapping from $x$ to $v$. Thus, $s^x_v = (s \setminus \{x\}) \cup [x \mapsto v]$. For any
execution $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ over (V, A), let $\alpha^x_v$ denote the execution $(s_0)^x_v a_1 (s_1)^x_v a_2 (s_2)^x_v \cdots$ over
$(V \cup \{x\}, A)$. With this definition, we can define the semantics of universal quantification.


$$(\alpha, j) \models \forall x : P \quad\text{iff}\quad \text{for all values } v,\ (\alpha^x_v, j) \models P$$


Thus, P must, for arbitrary (proper) values $v$, hold for the execution where $x$ is assigned the value
$v$ in every state. In [MP92] and [Lam91] this is known as quantification over rigid variables, since
the variable has a constant value during the execution. In [MP92] and [Lam91] quantification
over a program variable $x$ allows $x$ to vary during the execution. We do not consider that kind
of quantification in this work.


Existential quantification is defined in a similar fashion.


$$(\alpha, j) \models \exists x : P \quad\text{iff}\quad \text{there exists a value } v \text{ such that } (\alpha^x_v, j) \models P$$
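
Rigid quantification can be illustrated with a small Python sketch. Two simplifications are assumed here and are not part of the report: the quantified variable ranges over an explicitly given finite domain, and a formula is a function of a list of states and a position (actions are omitted for brevity). The helper names `with_rigid`, `forall`, and `exists` are hypothetical.

```python
def with_rigid(states, x, v):
    """Assign the rigid variable x the same value v in every state."""
    return [{**s, x: v} for s in states]


def forall(x, domain, p):
    """(forall x : p), with x restricted to a finite domain for this sketch."""
    return lambda states, j: all(p(with_rigid(states, x, v), j) for v in domain)


def exists(x, domain, p):
    """(exists x : p), with x restricted to a finite domain for this sketch."""
    return lambda states, j: any(p(with_rigid(states, x, v), j) for v in domain)


# Example formula: "y equals k at position j or later" on a finite state sequence.
def reaches_k(states, j):
    return any(s["y"] == s["k"] for s in states[j:])


counting = [{"y": 0}, {"y": 1}, {"y": 2}]
```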


Boolean Operators 


We give the standard definition of implication and negation. The remaining boolean operators
will be derived from these below.


If P and Q are temporal formulas over (V, A), then so is $P \Longrightarrow Q$, and we have


$$(\alpha, j) \models (P \Longrightarrow Q) \quad\text{iff}\quad (\alpha, j) \models P \text{ implies that } (\alpha, j) \models Q$$


If P is a temporal formula over (V, A), then so is $\neg P$, and we have


$$(\alpha, j) \models \neg P \quad\text{iff}\quad (\alpha, j) \not\models P$$


Since we allow boolean operators in both state functions and temporal formulas, there might
be an ambiguity as to how such boolean operators should be interpreted in a given temporal
formula. For example, $R = \bigcirc(x = 1 \Longrightarrow y = 2)$ can be regarded as obtained by A) applying
the next operator to the step formula $(x = 1 \Longrightarrow y = 2)$, or B) first applying the temporal
implies operator to the two step formulas $x = 1$ and $y = 2$, and then applying the next operator
to the result. It turns out that either interpretation leads to the same result as to whether the
formula holds at a certain position in an execution. However, to avoid confusion we adopt the
convention that step formulas in temporal formulas are always “as large as possible”; thus, we
consider R in the example to be produced as described in case A).

3.6 More Temporal Formulas


The rest of the temporal operators can be described syntactically from $\mathcal{W}$, $\Longrightarrow$ and $\neg$. Below
we assume that P and Q are temporal formulas over (V, A). The formulas we define are then
also temporal formulas over (V, A).

More Boolean Operators

Disjunction and conjunction are defined in the standard way.

$$P \vee Q \;\triangleq\; (\neg P) \Longrightarrow Q$$

$$P \wedge Q \;\triangleq\; \neg((\neg P) \vee (\neg Q))$$


The Inclusive Unless Operator 


The $\mathcal{W}$ operator defined above requires a formula P to hold forever or, if another formula Q
holds at some point, at least up to but not necessarily including the point where Q starts to
hold. Often we need to express that P also holds in the state where Q starts to hold. For this
reason we introduce the inclusive unless operator $\mathcal{W}_i$ defined as


$$P \,\mathcal{W}_i\, Q \;\triangleq\; P \,\mathcal{W}\, (P \wedge Q)$$


The Always Operator 


To express that a formula holds forever, we define $\Box P$, read always P.


$$\Box P \;\triangleq\; P \,\mathcal{W}\, \mathit{false}$$


The Eventually Operator 


To express that sooner or later a temporal formula holds, we define $\Diamond P$, read eventually P.


$$\Diamond P \;\triangleq\; \neg\Box(\neg P)$$


The (Inclusive) Until Operator 


The unless operator expresses that a temporal formula P holds at least until another temporal
formula Q starts to hold, but it does not require that Q eventually holds. (If Q does not hold
eventually, P should hold forever.) To express that Q is required to hold eventually, we define
$P \,\mathcal{U}\, Q$, read P until Q.


$$P \,\mathcal{U}\, Q \;\triangleq\; (\Diamond Q) \wedge (P \,\mathcal{W}\, Q)$$

There is also an inclusive version of the until operator.


$$P \,\mathcal{U}_i\, Q \;\triangleq\; (\Diamond Q) \wedge (P \,\mathcal{W}_i\, Q)$$


The Leads-To Operator 


The leads-to operator is an important temporal operator which expresses that during an execution,
if P holds at some point, then Q will hold at a later (or the same) point. Thus, $P \leadsto Q$,
read P leads to Q, is defined as


$$P \leadsto Q \;\triangleq\; \Box(P \Longrightarrow (\Diamond Q))$$
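
The basic and derived operators can be prototyped over finite executions. The sketch below is an illustration, not the report's method: it covers only quantifier-free formulas and uses the observation that, once a finite execution is extended with stuttering, every position at or beyond the last state sees the same suffix, so evaluation can be clamped to the last position. All names (`Execution`, `step`, `nxt`, `unless`, and so on) are hypothetical.

```python
ZETA = "zeta"  # assumed name for the stuttering action


class Execution:
    """Finite execution s0 a1 s1 ... an sn, implicitly extended with stuttering."""

    def __init__(self, states, actions):
        assert len(states) == len(actions) + 1
        self.states, self.actions = states, actions
        self.n = len(actions)  # index of the last state

    def step(self, j):
        if j < self.n:
            return self.states[j], self.actions[j], self.states[j + 1]
        last = self.states[self.n]
        return last, ZETA, last  # stuttering step at and beyond the end


# A temporal formula is a function (execution, position) -> bool.
def step(pred):
    return lambda ex, j: pred(*ex.step(j))


def nxt(p):
    # Positions beyond the last state all behave like the last one.
    return lambda ex, j: p(ex, min(j + 1, ex.n))


def neg(p):
    return lambda ex, j: not p(ex, j)


def implies(p, q):
    return lambda ex, j: (not p(ex, j)) or q(ex, j)


def unless(p, q):
    def holds(ex, j):
        for k in range(min(j, ex.n), ex.n + 1):
            if q(ex, k):
                return True   # q reached; p held at all earlier positions
            if not p(ex, k):
                return False  # p failed before q was reached
        return True           # p holds at every remaining position
    return holds


def always(p):
    return unless(p, lambda ex, j: False)


def eventually(p):
    return neg(always(neg(p)))


def leads_to(p, q):
    return always(implies(p, eventually(q)))


def satisfies(ex, p):
    return p(ex, 0)
```

For example, with `ex = Execution([{"x": 0}, {"x": 1}, {"x": 2}], ["inc", "inc"])`, the call `satisfies(ex, eventually(step(lambda s, a, s2: s["x"] == 2)))` evaluates the formula at position 0 of the stuttering-extended execution.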

3.6.1 Precedence


To avoid excessive use of parentheses, we use the following convention regarding the precedence 
(binding power) of the temporal operators. The operators in the group 


$\bigcirc \quad \Box \quad \Diamond \quad \neg$


have equal precedence but higher precedence than the operators 


$\wedge \quad \vee$
which, in turn, have equal precedence but higher precedence than the operators 


which have equal precedence. 


3.7 Functions and Temporal Formulas over Automata 


For any safe (timed) I/O automaton A whose state space is defined by state variables, denote
by variables( A) the set of state variables of A. We say that f is a state function or state 
transition function over A if f is a state function or state transition function over variables( A), 
respectively. Similarly, f is said to be an action function over A if it is an action function over 
(variables( A), acts(A)). This notion trivially extends to step formulas and temporal formulas. 


3.8 Satisfaction and Validity 


An execution $\alpha$ over (V, A) is said to satisfy a temporal formula P over (V, A), written $\alpha \models P$,
if and only if P holds at position 0 of $\alpha$, thus


$$\alpha \models P \quad\text{iff}\quad (\alpha, 0) \models P$$


A temporal formula P over (V, A) is said to be valid, written $\models P$, if every execution $\alpha$ over
(V, A) satisfies P, thus


$$\models P \quad\text{iff}\quad \text{for all } \alpha \text{ over } (V, A),\ \alpha \models P$$


We also introduce a notion of validity relative to a set E of executions over (V, A). A temporal
formula P over (V, A) is then E-valid, written $E \models P$, if every execution of E satisfies P, thus


$$E \models P \quad\text{iff}\quad \text{for all } \alpha \in E,\ \alpha \models P$$


This notion extends to A-validity, where A is a safe (timed) I/O automaton. Then, for any
temporal formula P over A, P is said to be A-valid, written $A \models P$, if every execution of A
satisfies P, thus


$$A \models P \quad\text{iff}\quad \text{for all } \alpha \in exec(A),\ \alpha \models P$$

3.9 Finite vs. Infinite Executions


Above $\alpha$ has ranged over infinite as well as finite executions. In this section we prove that the
question whether a temporal formula P holds at position $j$ in execution $\alpha$ is equivalent to the
question whether P holds at position $j$ in $\tilde{\alpha}$, the extension of $\alpha$ with infinite stuttering defined
in Section 3.1. This result is, of course, due to the semantics of
step formulas, which has a special case dealing with stuttering steps.


Lemma 3.1


Let P be a temporal formula over (V, A). Then, for all executions $\alpha$ over (V, A) and all $j \geq 0$,
$(\alpha, j) \models P$ iff $(\tilde{\alpha}, j) \models P$


Proof
In Appendix B.


3.10 Stuttering-Insensitive Temporal Formulas 


A temporal formula P over (V, A) is stuttering-insensitive if, for arbitrary executions $\alpha_1$ and
$\alpha_2$ over (V, A) with $\alpha_1 \simeq \alpha_2$, $\alpha_1 \models P$ if and only if $\alpha_2 \models P$. Thus, if P is stuttering-insensitive
and holds for $\alpha$, it holds for all executions that can be obtained from $\alpha$ by adding or removing
stuttering.


Below, in Proposition 3.4, we prove that certain types of temporal formulas are stuttering-insensitive.
However, first we need two technical lemmas.


Lemma 3.2


Let P be a temporal formula over (V, A) and $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ an arbitrary infinite execution
over (V, A). Then, for all $j \geq 0$ and all $i \leq j$


$$(\alpha, j) \models P \quad\text{iff}\quad ({}_{j-i}|\alpha, i) \models P$$


Proof
In Appendix B.


Lemma 3.3 


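Let $\alpha$ and $\alpha'$ be infinite executions such that $\alpha \simeq \alpha'$. Then, for all $k \geq 0$, there exists a $k' \geq 0$
such that


1. ${}_{k}|\alpha \simeq {}_{k'}|\alpha'$
2. for all $0 \leq i' < k'$, there exists an $i$ with $0 \leq i < k$ such that ${}_{i}|\alpha \simeq {}_{i'}|\alpha'$

Proof
In Appendix B.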

We can now characterize certain temporal formulas which are stuttering-insensitive. State predicates
are always stuttering-insensitive. This is because stuttering-equivalent executions will
always start in the same state. State transition predicates are not, however, stuttering-insensitive
in general. This is due to the fact that stuttering-equivalent executions do not necessarily
agree on the first step. All state transition predicates that hold in all stuttering steps are,
however, stuttering-insensitive. Also, step formulas of the form $\langle f \rangle$ are not stuttering-insensitive,
but $\Diamond\langle f \rangle$ is.

For the temporal operators, formulas of the form $\bigcirc P$ are not stuttering-insensitive in general.
Assume for instance that $\alpha_1 = s_0 a_1 s_1 a_2 s_2 \cdots$ and $\alpha_2 = s_0 \zeta s_0 a_1 s_1 a_2 s_2 \cdots$. Then $\alpha_1 \simeq \alpha_2$.
Assume that $(\alpha_1, j) \models P$ if and only if $j = 1$. Then $\alpha_1 \models \bigcirc P$ but $\alpha_2 \not\models \bigcirc P$, since by
Lemma 3.2 $(\alpha_2, 1) \models P$ iff $(\alpha_1, 0) \models P$. Thus, $\bigcirc P$ is not stuttering-insensitive.
However, all other temporal operators yield stuttering-insensitive temporal formulas
when applied to stuttering-insensitive formulas.
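
The counterexample for the next operator can be replayed concretely. In this sketch, which is only an illustration, P is taken to be the state predicate x = 1, executions are given by their state sequences (a stuttering step shows up as a repeated state), and positions past the end are clamped to the last state. The helper names are hypothetical.

```python
def state_at(states, j):
    """State at position j, with the last state repeated by stuttering."""
    return states[min(j, len(states) - 1)]


def next_holds(pred, states):
    """Does the execution satisfy (next pred), for a state predicate pred?"""
    return pred(state_at(states, 1))


def x_is_one(s):
    return s["x"] == 1


alpha1 = [{"x": 0}, {"x": 1}]             # s0 a1 s1
alpha2 = [{"x": 0}, {"x": 0}, {"x": 1}]   # s0 zeta s0 a1 s1, equivalent to alpha1
```

Here `next_holds(x_is_one, alpha1)` and `next_holds(x_is_one, alpha2)` disagree even though the two executions differ only by one stuttering step, while the state predicate itself gives the same answer at position 0 of both.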


Proposition 3.4 


1. Every state predicate P is stuttering-insensitive.


2. If P is a state transition predicate such that for all states $s$, $(s, \zeta, s) \models P$, then P is
stuttering-insensitive.


3. If $f$ is an action function, then $\Diamond\langle f \rangle$ is stuttering-insensitive.

4. If P and Q are stuttering-insensitive, then


a) $P \,\mathcal{W}\, Q$,
b) $\forall x : P$,
c) $\exists x : P$,
d) $\neg P$, and
e) $P \Longrightarrow Q$


are all stuttering-insensitive.


Proof
In Appendix B.


3.11 Comparison with Manna and Pnueli’s Temporal Logic


The temporal logic of Manna and Pnueli [MP92] is state based in the sense that temporal for- 
mulas are evaluated over sequences of states, i.e., with no actions interleaved. These sequences 
(computations) must be infinite; terminating computations are made infinite by appending in- 
finite stuttering at the end. 

As Lemma 3.1 indicates we could also have chosen to deal with infinite executions only: any
temporal formula in our temporal logic is satisfied by a finite execution $\alpha$ if and only if the
temporal formula is satisfied by the infinite execution obtained by appending infinite stuttering
at the end of $\alpha$. This indicates that the use of infinite computations only in [MP92] as opposed
to our use of both finite and infinite executions is not an important difference between the two
logics.

The real difference lies in the important role of actions in our logic. We need to be able to
express properties of the actions occurring in executions. However, as the following discussion
indicates, several results of [MP92] carry over to our logic.


Consider any (infinite) execution

$$\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$$

This execution can be encoded as the following state based computation:

$$\sigma = (s_0, a_1, s_1)(s_1, a_2, s_2) \cdots$$

Thus, each state of $\sigma$ is a triple. Specifically, states of $\sigma$ are assignments of the form:


$$[x_1 \mapsto v_1, \ldots, x_n \mapsto v_n,\ act \mapsto a,\ x'_1 \mapsto v'_1, \ldots, x'_n \mapsto v'_n]$$


where the variable assignments to $x_1, \ldots, x_n$ represent the first state in a triple, the special
variable $act$ holds the action of the triple, and the variable assignments to $x'_1, \ldots, x'_n$ represent
the last state in the triple.

Now, any valid temporal formula of [MP92] holds, in particular, for computations where
each state has the form $(s, a, s')$ such that the last state of each triple coincides with the first
state of the next triple. Thus, valid formulas of [MP92] hold specifically for all computations
that are encodings of our executions.
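
The encoding can be sketched for a finite prefix of an execution. The function names `encode` and `is_encoding` are hypothetical, and restricting to finite lists is an assumption of the sketch (the text above considers infinite executions).

```python
def encode(states, actions):
    """Encode s0 a1 s1 a2 s2 ... as the triples (s0, a1, s1)(s1, a2, s2) ..."""
    assert len(states) == len(actions) + 1
    return [(states[i], actions[i], states[i + 1]) for i in range(len(actions))]


def is_encoding(sigma):
    """Check that the last state of each triple is the first state of the next."""
    return all(sigma[i][2] == sigma[i + 1][0] for i in range(len(sigma) - 1))
```

The second function captures why validities do not transfer in the other direction: a general state based computation need not satisfy this chaining condition.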

In order for such validity results of [MP92] to carry over to our temporal logic, it is important 
that the operators of [MP92] that we also use have a similar semantics in the two temporal logics, 
but this is easy to see. In fact, we have been guided by the temporal logic of [MP92] when defining 
the semantics of our temporal operators. 

Note that, since our notion of execution in the encoding into computations is more restrictive
than general computations, validities in our logic do not carry over to the temporal logic of
[MP92].


3.12 Rules and Meta Rules 


Temporal logics, or any logic for that matter, usually contain inference rules which allow validities
to be inferred from other validities. This is, however, not the way we shall use our temporal logic
in the verification examples in this work. Typically, we are given a particular execution $\alpha$ which
satisfies a temporal formula P and then have to show that $\alpha$ satisfies another temporal formula
Q. Thus, our proofs will be proofs of satisfaction as opposed to proofs of validity.

So, for our purpose inference rules are not very useful. Instead we shall use rules in the form
of valid implications.


$$\models P \Longrightarrow Q$$


Such a rule (together with the definition of implication) allows us to conclude $\alpha \models Q$ from
$\alpha \models P$.

We now present the rules that we use in our correctness proofs below. We do not present simple
rules such as manipulation of Boolean operators or rules like


Par: $\models (\Box P) \Longrightarrow P$


but implicitly use such rules in our proofs. An approach like TLA [Lam91] has invested a lot of 
effort in finding rules that are typically used when proving systems correct. Such an investigation 
still needs to be done for our temporal logic. Thus, we present the rules we have found a need 
for in the particular examples presented in this work and leave the more general investigation 
for further research. We do not prove that the rules are actually validities but we note that 
this should follow easily from an encoding into the temporal logic of [MP92] as described in 
Section 3.11. In the rules we let P(k) denote a formula with k free. Then, e.g., P(0) is the
formula obtained from P(k) by replacing all free occurrences of k with 0.


MP: $\models (((P_1 \wedge \cdots \wedge P_n) \Longrightarrow Q) \wedge P_1 \wedge \cdots \wedge P_n) \Longrightarrow Q$

MPI: E (O(P Q)AOCP)= OQ 

Prol: FE (VA: dk’: (k > kh’ A P(k)) ~~ P(k’)) = OP(O) 

Pro2: E (O(P (QW R)) A (QQ OS) A (QA S)~ R)) = (P~ R) 

Ind: KE ((P(0)~+ Q) AVE: (k > 0 => Fk’ (bh < kA (P(k) ~ PCR’) V Q)))) => 
Vn: (P(n) ~~ Q) 

Unl: E (O(P AQ) A (PW; Q)) P 

Unll: E (O(P (Q W; R)) A (QQ ©S)) (P (ORV OOS)) 


The rules allow us to prove that a given execution satisfies a formula, provided it satisfies another 
formula. We shall be using other rules, called meta rules, which cannot be stated as validities. 
For instance, if $\alpha \models \Box P$ and $\alpha'$ is a suffix of $\alpha$, then $\alpha' \models \Box P$. Again, we present the meta rules
we have found useful in our particular examples, and leave an investigation of a “complete” set 
of meta rules as well as proofs of our meta rules for further research. We note, however, that 
many of the meta rules can be proved using Lemma 3.2. 


Lemma 3.5
1. If α ⊨ □P and α′ is a suffix of α, then α′ ⊨ □P.
2. If, for all suffixes α′ of α, α′ ⊨ P, then α ⊨ □P.
3. If α ⊨ ◇P, then there exists a suffix α′ of α such that α′ ⊨ P.
4. If there exists a suffix α′ of α and α′ ⊨ P, then α ⊨ ◇P.
5. If, for any proper constant v, α ⊨ P(v), then α ⊨ ∀k: P(k).
6. If α ⊨ ∀k: P(k), then, for any proper constant v, α ⊨ P(v).
7. If, for some proper constant v, α ⊨ P(v), then α ⊨ ∃k: P(k).
8. If α ⊨ ∃k: P(k), then there exists a proper constant v such that α ⊨ P(v).
∎


Since, in our proofs below, we shall use the different parts of Lemma 3.5 extensively, we
sometimes use several parts at once and then simply refer to the lemma rather than to the
particular parts.

This concludes the introduction to our temporal logic. The temporal logic is especially designed
so that formulas are evaluated over executions of safe (timed) I/O automata. This allows us 
to use the temporal logic to specify liveness conditions of live (timed) I/O automata and use 
the rules of the temporal logic in correctness proofs. Exactly how we use the temporal logic for 
specifying liveness conditions is one of the issues of the next chapter. 


Chapter 4 
Specifying Systems 


Chapter 2 introduced our basic models of timed and untimed systems. The models are entirely 
semantic: they describe the operational meaning of a system, that is, how a system behaves 
when executed. 

A live I/O automaton consists of mathematical objects like sets and lists. However, these 
sets and lists may be infinite, which indicates that a direct enumeration is not feasible. Thus, we 
need a language or some syntax, other than standard mathematical notation, for writing down 
elements of our models. This chapter describes the syntax we use. 

Furthermore, we describe how the effect of semantic operators (like parallel composition) is 
reflected in the syntax. For instance, we shall use the language of the temporal logic of Chapter 3 
for specifying liveness conditions. We then show, e.g., that under certain circumstances if the
liveness of two systems is described by temporal formulas Q_A and Q_B, respectively, then the
liveness of the composed system is described by Q_A ∧ Q_B. This is important since it enables us
to obtain a syntactic specification of the composed system directly from the specification of the 
component systems. 


The rest of this chapter is organized as follows. We first, in Section 4.1, deal with untimed 
systems and then, in Section 4.2, show how timed systems can be specified. Finally Section 4.3 
proves important embedding results. 


4.1 Specifying Untimed Systems 


4.1.1 Safe I/O Automata 


Safe I/O automata will be specified using the precondition-effect style normally used for speci- 
fying the I/O automata of [LT87, LT89]. 

This style assumes that the state space of the safe I/O automaton is described as a mapping 
from state variable names to their values. Thus, the state space of a safe I/O automaton will 
be described by listing the state variable names together with their types. The start states of 
a safe I/O automaton are then specified by giving the possible values the state variables can 
assume initially. 

As an example, consider the specification of a one-place buffer with the following functions:
a message m can be placed in the buffer by the input action send(m) and removed from the
buffer by the output action receive(m). (The environment is thought of as sending messages
to the buffer and receiving them from the buffer.) If a new message is sent to the buffer before
the previous message is passed on to the receiver, a special overflow flag is set, which leads to
an output action overflow. Initially the buffer is empty and the overflow flag is not set. Thus,
the state space and start state of this safe I/O automaton are described as:


Variable   Type         Initially   Description

buf        Msg ∪ {⊥}    ⊥           The one-place buffer. The symbol ⊥ denotes the
                                    empty buffer.

of         Bool         false       The overflow flag. A value of true denotes
                                    overflow.


We denote by variables(A) the set of state variables of the safe I/O automaton A. We use the 
normal record-notation for referencing the values of state variables in a given state. For instance, 
the value of state variable buf in state s is denoted by s.buf. Formally, since s is a mapping 
from variables to values, we have s.buf = s(buf). 


The action signature of the one-place buffer is described as follows: 


Input:
    send(m), m ∈ Msg
Output:
    receive(m), m ∈ Msg
    overflow
Internal:
    none


Thus, even though there might be infinitely many actions (Msg might be infinite), we use 
only finitely many action generator functions to describe these actions. (The action generator 
functions are assumed to be disjoint and their union to be injective). 


It now only remains to show how to define the transition relation. Generally, for each action
generator function we define one or more step rules. For example, in the case of the action
generator function send above we might want to define two step rules based on some partition
of the messages Msg into Msg_1 and Msg_2. Then one step rule would define steps labeled with
actions from {send(m) | m ∈ Msg_1}, and the other would define steps labeled with actions from
{send(m) | m ∈ Msg_2}. The sets Msg_1 and Msg_2 could even be overlapping, in which case we
introduce nondeterminism of the send steps. A step rule has the form


agf(x, y, ...)
    Precondition:
        P
    Effect:
        E


where agf is an action generator function over the variables x, y, etc., P is a precondition, and
E is an effect clause.

The precondition P is a state predicate over the state variables of the system and the variables
x, y, etc. A particular action, say agf(1,2,...), is then enabled in state s, if P holds in s after
replacing free occurrences of x with 1, free occurrences of y with 2, and so on.

The effect clause E uses a Pascal-like style of assignments. Thus, the effect clause consists
of a list of assignments (one per line) of the form


v := e


where v is a state variable and e is an expression (state function), of the same type as v, over
the state variables and the variables x, y, etc. Again, for a particular action agf(1,2,...) we
must replace free occurrences of x with 1, free occurrences of y with 2, and so on, in the expression
e. If e′ denotes this instantiated expression, then if s is the state before the assignment, the
result of executing the assignment is the state s′ obtained by changing the value of v to s[e′].
Thus, s′ = (s \ {v}) ∪ [v ↦ s[e′]]. The result of executing a list of assignments


assignment_1
assignment_2
...


is obtained by first executing assignment_1, then assignment_2, and so on. Thus, the state will be
changed in a sequential manner, but remember that this is just a convenient way of describing
the post-state of the step, namely the state after the last assignment. In TLA [Lam91] the
effects of steps are given by directly relating the values of the individual state variables in the
pre- and post-states, but we have chosen this more program-like notation.

To make some assignments conditional we use an if-then-else construct. An example of such 
a construct is, 


if P then
    assignment_1
    assignment_2
else
    assignment_3
    assignment_4


where P is a state predicate. The semantics is of course that if P holds when control has reached
the if-statement, then assignments 1 and 2 are executed (in that order); otherwise assignments
3 and 4 are executed. Note that we use indentation to indicate the end of the if-then-else
construct. This means that


if P then
    assignment_1
    assignment_2
else
    assignment_3
assignment_4


is different from the previous if-then-else construct in that this construct first executes either 
assignments 1 and 2 or assignment 3 depending on the value of P, and then, unconditionally, 
executes assignment 4. We omit the else part of an if-then-else construct if it contains no 
assignments. 

The format of the effect-clause described so far does not allow nondeterminism for a particular 
action. To specify such nondeterminism we will use optional assignments of the form 


optionally v := e


with the meaning that nondeterministically either the assignment is or is not executed. 
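
The reading of effect clauses given above (sequential assignments, if-then-else, and optional assignments) can be illustrated by a small interpreter. This is only a sketch and not part of the formal framework: the representation of states as Python dictionaries and the helper names assign, optionally, if_then_else, and run_all are assumptions introduced here for illustration.

```python
# Illustrative sketch: a state is a dict from variable names to values.
# Every statement maps a state to the list of its possible post-states,
# so nondeterminism (optional assignments) shows up as several results.

def assign(var, expr):
    """v := e, where expr computes the new value from the current state."""
    def stmt(state):
        new_state = dict(state)
        new_state[var] = expr(state)
        return [new_state]
    return stmt

def optionally(var, expr):
    """optionally v := e: the assignment may or may not be executed."""
    def stmt(state):
        return [dict(state)] + assign(var, expr)(state)
    return stmt

def if_then_else(pred, then_stmts, else_stmts=()):
    """Run then_stmts if pred holds in the current state, else else_stmts."""
    def stmt(state):
        return run_all(then_stmts if pred(state) else else_stmts, state)
    return stmt

def run_all(stmts, state):
    """Execute the statements in order and return all possible post-states."""
    states = [dict(state)]
    for stmt in stmts:
        states = [post for s in states for post in stmt(s)]
    return states
```

Because each statement is applied to the state produced by the previous one, a later assignment sees the values written by earlier ones, which matches the sequential reading described in the text.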

We could have been more formal in defining the syntax and semantics of assignments, etc.,
but since such syntax and semantics are standard, we have chosen to keep the exposition at a
more intuitive level.

Finally, we note that step rules may contain variables which are neither state variables nor
variables occurring in action generator functions. Such variables can be thought of as constants,
and we then effectively define a step rule for each proper value of the constant. An example is
the following step rule, where n is such an extra variable.


agf(x, y, ...)
    Precondition:
        P ∧ 0 < n < 10
    Effect:
        E
        v := x + n


Safe I/O automata must be input-enabled (cf. Definition 2.1). This is ensured by omitting the
preconditions for input actions. This has the same meaning as a precondition of true. The
definition of the transition relation for the one-place buffer now looks like:


send(m)
    Effect:
        if buf ≠ ⊥ then
            of := true
        buf := m

receive(m)
    Precondition:
        buf = m
    Effect:
        buf := ⊥

overflow
    Precondition:
        of = true
    Effect:
        of := false


An operational way to read such a definition is as follows. The definition for send(m) says that
if the buffer receives a new message m when buf is not empty, the overflow flag of is set. After
that the new message is placed in buf (and a possible previous message will thus be overwritten).
The one-place buffer can perform a receive(m) step if m is the message in the buffer. The result
is to empty the buffer. Finally, overflow can be signaled if the overflow flag of is set, and the
result is that of gets reset to false.
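
As an illustration, the one-place buffer described above can be sketched in Python. The encoding is an assumption made for this example only: the empty buffer is represented by None, messages are strings, and each locally-controlled action gets an explicit enabledness predicate next to its effect.

```python
from dataclasses import dataclass
from typing import Optional

EMPTY = None  # assumed encoding of the empty-buffer symbol

@dataclass(frozen=True)
class BufferState:
    buf: Optional[str] = EMPTY  # the one-place buffer, initially empty
    of: bool = False            # the overflow flag, initially not set

def send(s: BufferState, m: str) -> BufferState:
    """Input action: no precondition, so it is enabled in every state."""
    of = True if s.buf is not EMPTY else s.of
    return BufferState(buf=m, of=of)

def receive_enabled(s: BufferState, m: str) -> bool:
    """Precondition of receive(m): m is the message in the buffer."""
    return m is not EMPTY and s.buf == m

def receive(s: BufferState, m: str) -> BufferState:
    if not receive_enabled(s, m):
        raise ValueError("receive(m) is not enabled")
    return BufferState(buf=EMPTY, of=s.of)

def overflow_enabled(s: BufferState) -> bool:
    """Precondition of overflow: the overflow flag is set."""
    return s.of

def overflow(s: BufferState) -> BufferState:
    if not overflow_enabled(s):
        raise ValueError("overflow is not enabled")
    return BufferState(buf=s.buf, of=False)
```

Sending two messages in a row without an intervening receive overwrites the first message and sets the flag, after which overflow becomes enabled.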


4.1.1.1 Operations on Safe I/O Automata 


In Section 2.1.1 we defined the three operators (parallel composition, action hiding, and action
renaming) on safe I/O automata. Below we explain how the safe I/O automata resulting from
applying these operators can be described using syntax derived from the description of the safe
I/O automata to which the operators were applied.


We start by considering parallel composition of safe I/O automata. In Definition 2.2, which
defines parallel composition, we defined a notion of compatibility for safe I/O automata. This
notion guarantees that each action in a composed system is controlled by at most
one component and that internal actions are unique. Definition 2.2 also says that the state space
of a composed system is the cartesian product of the component state spaces. This means that
if we want to reference the value of a certain state variable of one component, we first have to
extract the state of the component from the total state. This becomes even more cumbersome
if several levels of parallel composition are used. In order to avoid dealing with these not very
interesting details of extracting component states of component states, etc., we will extend the
notion of compatibility to also include the requirement that the sets of state variables of the
component systems be disjoint. In this way a state s of the composed system can be uniquely
described by an assignment of values to the total set of state variables in the system such that
the value of any state variable x in s agrees with the value of x in the state of the component
to which x belongs. (More precisely, such a “flat” assignment of values to state variables is
isomorphic to the state defined by the parallel composition operator in Chapter 2.) Thus, if s_i
describes the state of the ith component as a mapping from state variables of this component
to their values, the state of the composed system is described by the mapping s_1 ∪ ··· ∪ s_N.

Thus, below we shall use the following definition of compatibility (cf. Definition 2.2): Safe
I/O automata A_1, ..., A_N are syntactically compatible if for all 1 ≤ i, j ≤ N with i ≠ j


1. out(A_i) ∩ out(A_j) = ∅
2. int(A_i) ∩ acts(A_j) = ∅
3. variables(A_i) ∩ variables(A_j) = ∅.


Note that the first two conditions have not changed. Below we let “compatibility” refer to 
“syntactical compatibility”. 
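
The three conditions can be checked mechanically when the action and variable sets are finite. The following sketch assumes such a finite encoding; the Signature class and the function name syntactically_compatible are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Signature:
    """Assumed finite description of one safe I/O automaton."""
    inputs: frozenset
    outputs: frozenset
    internals: frozenset
    variables: frozenset

    @property
    def acts(self):
        return self.inputs | self.outputs | self.internals

def syntactically_compatible(components):
    """Check conditions 1-3 for every pair of distinct components."""
    for a, b in combinations(components, 2):
        if a.outputs & b.outputs:                          # condition 1
            return False
        if a.internals & b.acts or b.internals & a.acts:   # condition 2
            return False
        if a.variables & b.variables:                      # condition 3
            return False
    return True
```

Condition 2 is not symmetric in i and j, so the sketch tests it in both directions for each unordered pair.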

This notion of compatibility trivially extends to live I/O automata (cf. Definition 2.9). A
consequence of this way of looking at the state space of a composed system is that for compatible
safe I/O automata A_1, ..., A_N, the set of state variables of A = A_1 ‖ ··· ‖ A_N is given by
variables(A) = variables(A_1) ∪ ··· ∪ variables(A_N).

Thus, the state variables (together with types and initial values) of a composed system can 
be described by writing the lists of state variables for the components one below the other. In 
a similar fashion it is easy to list the action signature of the composed system. 

The question is, how can the description of the steps of the composed system be derived 
from the description of the steps of the components? Remember, from Definition 2.2, that in 
each step of the composed system several components might participate (each executing state 
changes described locally for the action of that step) whereas all other components do not 
change their state. Also remember that the action of the step is locally-controlled by at most
one component. That is, either the action is an input action for all participating components,
or it is locally-controlled by one component and an input action for the remaining participating
components. Then, if the step rules for send(m) in three components, one of which controls the
action, are described by


send(m)                  send(m)          send(m)
    Precondition:            Effect:          Effect:
        P_1                      E_2              E_3
    Effect:
        E_1


then the send(m) steps of the composed system can be described by


send(m)
    Precondition:
        P_1
    Effect:
        E_1
        E_2
        E_3


Note that the order of the three effect clauses is unimportant since E_1, E_2, and E_3 mention
disjoint sets of state variables.

Since the construction of the step rules of the composed system is so simple, we usually omit 
the explicit construction and instead refer to the step rules of the components. 
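
The construction can be sketched as follows, assuming flat states represented as Python dictionaries and step rules represented as (precondition, effect) pairs. The function name compose_step_rules and this encoding are assumptions for illustration, not notation from the text.

```python
def compose_step_rules(rules):
    """Combine the step rules of the participating components for one action.

    rules holds one (precondition, effect) pair per participating component.
    The precondition is None where the action is an input action. Each effect
    maps a flat state dict to a new flat state dict and is assumed to read and
    write only the variables of its own component.
    """
    preconditions = [pre for pre, _ in rules if pre is not None]
    if len(preconditions) > 1:
        raise ValueError("an action is controlled by at most one component")

    def enabled(state):
        # With no controlling component the action is an input: always enabled.
        return all(pre(state) for pre in preconditions)

    def effect(state):
        # Apply the component effects one after the other.
        for _, eff in rules:
            state = eff(state)
        return state

    return enabled, effect
```

Since the component effects touch disjoint sets of variables, listing the rules in a different order yields the same post-state.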

For action hiding the situation is much simpler (cf. Definition 2.3). If, for instance, A is a
safe I/O automaton and 𝒜 is a set of locally-controlled actions of A, the syntactic description
of A \ 𝒜 is obtained from the syntactic description of A by simply moving the action generator
functions describing output actions in 𝒜 from the list of action generator functions describing
output actions to the list of action generator functions describing internal actions. Of course,
if only some of the actions described by an action generator function are hidden, the action
generator function will have to be split. For example, if send-nat(i), where i ∈ ℕ, is an action
generator function for output actions of A, and 𝒜 = {send-nat(i) | i ≥ 100}, then send-nat(i),
0 ≤ i ≤ 99, will be in the listing of output actions of A \ 𝒜 and send-nat(i), i ≥ 100, will be in
the listing of internal actions of A \ 𝒜.

Finally, for action renaming we use mappings of the form [send(m) ↦ send-message(m) |
m ∈ Msg] ∪ ···, where, intuitively, each entire action generator function is being renamed. In
this case each action generator function is simply replaced according to the action mapping in
the syntactic descriptions of the action signature and the steps.


In the remainder of this work we shall assume that the syntactic changes to safe (timed) I/O 
automata reflecting semantic operations on these are well understood and concentrate on the 
more interesting aspects of defining liveness. 


4.1.2 Live I/O Automata 


We specify a liveness condition L for a safe I/O automaton A indirectly in terms of a temporal
formula Q over A in the following way:


L = {α ∈ exec(A) | α ⊨ Q}    (4.1)


That is, the liveness condition L consists of all the executions of A that satisfy a certain temporal
formula Q. Of course, we have to make sure that what we define is in fact a liveness condition
for A, i.e., we must make sure that any finite execution of A can be extended to an execution
in L. We shall refer to any temporal formula Q over A that defines a liveness condition L for
A as a liveness formula for A. Moreover, we call the liveness formula environment-free for A if
(A, L) is environment-free and thus is a live I/O automaton.

Given a liveness formula Q for A, we shall refer to the liveness condition defined by (4.1) as
the liveness condition for A induced by Q.


4.1.2.1 Operations on Live I/O Automata 


In Section 2.1.2 we defined the three operators (parallel composition, action hiding, and action
renaming) on live I/O automata. If our approach of specifying liveness using temporal formulas
is to have any practical relevance, it is important that the environment-free liveness
formulas inducing the liveness conditions for the resulting live I/O automata can be obtained
directly from the environment-free liveness formulas for the original live I/O automata.

This section proves that this is the case given a few restrictions. As always we start with
the result for parallel composition, which requires three preliminary lemmas, the first of which
embodies the complexity of the proof.

To help us state and prove the results below, we first define a notion of restriction of an execution
over (V, A) to (V′, A′). This notion differs from the notion of projection of executions to
automata as defined in Chapter 2 since it introduces stuttering steps for actions not in A′,
whereas the definition in Chapter 2 simply removes such steps. Below we shall, however, see
how the two notions are related.

For any V-state s, s ↾ V′, where V′ ⊆ V, is the V′-state obtained from the mapping s by
restricting the domain to V′.

Then, for any execution α over (V, A), define α ↾ (V′, A′), where V′ ⊆ V and A′ ⊆ A, to
be the execution over (V′, A′) obtained from α by replacing each state s in α with s ↾ V′ and
replacing each action a ∉ A′ with ζ.
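
A minimal sketch of this restriction, assuming a finite execution given as an alternating Python list of states (dictionaries) and actions. The constant STUTTER is an assumed stand-in for the stuttering symbol, and the function names are hypothetical.

```python
STUTTER = "stutter"  # assumed marker for a stuttering step

def restrict_state(state, variables):
    """Restrict the domain of a state (a dict) to the given variables."""
    return {v: state[v] for v in variables}

def restrict_execution(alpha, variables, actions):
    """Restrict an alternating list [s0, a1, s1, a2, s2, ...].

    States (even positions) are restricted to `variables`; actions (odd
    positions) outside `actions` are replaced by STUTTER. No step is
    removed, so the result has the same length as alpha.
    """
    result = []
    for position, item in enumerate(alpha):
        if position % 2 == 0:
            result.append(restrict_state(item, variables))
        else:
            result.append(item if item in actions else STUTTER)
    return result
```

Keeping the length unchanged is the point of the definition: position j of the restricted execution corresponds to position j of the original one.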


Lemma 4.1
Let P be a temporal formula over (V′, A′). Then, for all pairs (V, A) with V′ ⊆ V and A′ ⊆ A,
all executions α over (V, A), and all j ≥ 0,


(α ↾ (V′, A′), j) ⊨ P iff (α, j) ⊨ P


Proof
In Appendix B.
∎


We now give an alternative characterization of the projection operator | on executions defined
in Section 2.1.1. For any execution α of a safe I/O automaton A_1 ‖ ··· ‖ A_N, define


α ↾ A_i = α ↾ (variables(A_i), acts(A_i))

Then α|A_i = ♮(α ↾ A_i) and clearly we have α|A_i ≃ α ↾ A_i.


The following lemma is now a direct consequence of Lemma 4.1. 


Lemma 4.2
Let A_1, ..., A_N be compatible safe I/O automata and let Q_1, ..., Q_N be temporal formulas over
A_1, ..., A_N, respectively. Furthermore, let A = A_1 ‖ ··· ‖ A_N and α ∈ exec(A). Then, for all
1 ≤ i ≤ N and all j ≥ 0,


(α ↾ A_i, j) ⊨ Q_i iff (α, j) ⊨ Q_i


Proof
Since α is an execution over (variables(A), acts(A)) and each Q_i is a temporal formula over
(variables(A_i), acts(A_i)) with variables(A_i) ⊆ variables(A) and acts(A_i) ⊆ acts(A), the result
follows directly from Lemma 4.1 and the definition of α ↾ A_i.


Lemma 4.3
Let A_1, ..., A_N be compatible safe I/O automata and let Q_1, ..., Q_N be stuttering-insensitive
temporal formulas over A_1, ..., A_N, respectively. Let A = A_1 ‖ ··· ‖ A_N and α ∈ exec(A). Then,


α|A_1 ⊨ Q_1 and ··· and α|A_N ⊨ Q_N iff α ⊨ Q_1 ∧ ... ∧ Q_N


Proof
In Appendix B.
∎


The following important result for parallel composition can now be proved. 


Proposition 4.4
Let (A_1, L_1), ..., (A_N, L_N) be compatible live I/O automata and let Q_1, ..., Q_N be
stuttering-insensitive temporal formulas over A_1, ..., A_N, respectively, such that each L_i is
induced by Q_i. Let (A, L) = (A_1, L_1) ‖ ··· ‖ (A_N, L_N). Then L is induced by Q_1 ∧ ... ∧ Q_N.


Proof
In Appendix B.
∎


It is important to understand the role that stuttering-insensitivity plays in the proposition. In
the execution of a composed system, each step represents activity in a certain subset of the
components while all other components do not engage in the step at all. When projecting the
execution to any component, such steps where the component does not engage (i.e., stuttering
steps) are simply removed. Thus, when specifying the liveness for a component system (A_i, L_i),
we might write Q_i = ◇□(x′ = x + 1) and hence specify that in any live execution (of (A_i, L_i))
there must be an infinite suffix where x is incremented by one at each step. Now, in a live
execution α of the composed system, even though α|A_i satisfies Q_i, α itself does not necessarily
satisfy Q_i since steps performed by other components might result in x being incremented only
in, e.g., every other step (but still, of course, incremented in every step where A_i engages). In the
proposition we solve the problem by simply ruling out Q_i since it is not stuttering-insensitive.
However, in the example we might write the following stuttering-insensitive liveness condition
which captures the same idea: Q′_i = □◇⟨acts(A_i)⟩ ∧ ◇□(⟨acts(A_i)⟩ ⟹ (x′ = x + 1)). Thus,
Q′_i describes that there is a suffix, with infinite activity of A_i, such that every time A_i engages,
x is incremented.


Attention is now turned to the simpler operations of action hiding and action renaming. 


Proposition 4.5
Let (A, L) be a live I/O automaton such that L is induced by the temporal formula Q for A and
let 𝒜 ⊆ local(A). Then the liveness condition of (A, L) \ 𝒜 is induced by Q.


Proof
In Appendix B.
∎


Proposition 4.6
Let (A, L) be a live I/O automaton such that L is induced by the temporal formula Q for A, and
let ρ be an action mapping applicable to (A, L). Define ρ(Q) to be the temporal formula obtained
by applying ρ to every action function in Q. Then the liveness condition of ρ((A, L)) is induced
by ρ(Q).


Proof
In Appendix B.
∎


4.1.2.2 Fairness 


Fairness is a special form of liveness, where the requirement is that each component of the
system be given fair turns. Fairness is important since in most cases it is environment-free,
and furthermore fairness is easy to implement on a physical system. Traditionally, two different
kinds of fairness are considered: weak and strong fairness.

Weak fairness to a system component or, as we shall phrase it, to the set of actions representing
this component says that actions from the set cannot be enabled indefinitely without
being executed infinitely often. Thus, for a safe I/O automaton A and a set C ⊆ acts(A), weak
fairness to C can be expressed as the temporal formula


WF_A(C) = □◇⟨C⟩ ∨ □◇¬enabled_A(C)    (4.2)


where enabled_A(C) is a state predicate over A that holds in exactly the states of A where an
action in C is enabled. As usual we omit the subscript A and write WF(C) and enabled(C)
when A is clear.

We have in this work found it useful to use a slight variant of weak fairness in which actions 
are only forced to occur if they are enabled indefinitely and a special forcing condition is satisfied 
indefinitely. This can be formalized as 


WF(C, P) = □◇⟨C⟩ ∨ □◇¬(enabled(C) ∧ P)    (4.3)


where P is a state predicate (the forcing condition). When using this variant of weak fairness, it 
is possible to separate the issues of when actions may occur (are enabled) and when they must 
occur. 

Strong fairness says that actions from a set must be executed infinitely often if actions from 
the set are enabled infinitely often. In other words, we cannot ignore the actions forever if we 
are given infinitely many chances to execute them. 


SF(C) = □◇⟨C⟩ ∨ ◇□¬enabled(C)    (4.4)


Again, with a forcing condition this looks like 


SF(C, P) = □◇⟨C⟩ ∨ ◇□¬(enabled(C) ∧ P)    (4.5)
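
The difference between the weak and strong variants can be made concrete on ultimately periodic executions, i.e., executions consisting of a finite prefix followed by a cycle that repeats forever. On such an execution only the cycle matters: something happens infinitely often exactly when it happens somewhere in the cycle, and something holds from some point on exactly when it holds throughout the cycle. The sketch below relies on this observation. The encoding of a cycle as a non-empty list of (state, action) pairs and the function names are assumptions made for the example.

```python
def weakly_fair(cycle, C, enabled, forcing=lambda s: True):
    """Weak fairness to C on an execution that repeats `cycle` forever.

    cycle   -- non-empty list of (state, action) pairs; action is taken from state
    enabled -- state predicate: some action in C is enabled
    forcing -- optional forcing condition P (defaults to true)
    """
    executed_infinitely_often = any(a in C for _, a in cycle)
    unforced_infinitely_often = any(
        not (enabled(s) and forcing(s)) for s, _ in cycle)
    return executed_infinitely_often or unforced_infinitely_often

def strongly_fair(cycle, C, enabled, forcing=lambda s: True):
    """Strong fairness to C on an execution that repeats `cycle` forever."""
    executed_infinitely_often = any(a in C for _, a in cycle)
    unforced_from_some_point_on = all(
        not (enabled(s) and forcing(s)) for s, _ in cycle)
    return executed_infinitely_often or unforced_from_some_point_on
```

A cycle in which C is alternately enabled and disabled but never executed is weakly fair and not strongly fair, which is exactly the situation where the two notions differ.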


It is easy to see that temporal formulas of the form WF(C), WF(C, P), SF(C), or SF(C, P),
where C ⊆ acts(A) and P is a state predicate over A, are liveness formulas for A. But are they
environment-free? First of all, environment-freedom requires that C consist of only
locally-controlled actions, since otherwise we could be restricting the environment to perform certain
input actions. This condition turns out to be sufficient for weak fairness to be environment-free.
However, there is a problem with strong fairness as illustrated by the following example: Let L be
induced by the strong fairness formula SF(C) for A, where C ⊆ local(A). Then, for any infinite
execution α in L it is the case that if C is enabled in infinitely many states in α, then α contains
infinitely many actions from C. Now suppose, in the game between system and environment,
that each environment move consists of two input actions: one that is bound to enable C and
one that is bound to disable C (thus no g function of a strategy can be defined to avoid that
C is enabled between the input actions and disabled afterwards). In this situation no strategy
function f can be defined that can ever execute an action in C during such a game; in other
words, every time the system gets a chance to move, it is not possible to execute an action in C
since C is not enabled. Thus, any strategy defined on A will, when playing against this villainous
environment, generate an outcome in which C is infinitely often enabled (namely between the
two input actions of every environment move) but in which only finitely many C actions are
executed. Thus the outcome is not live and it follows that SF(C) is not environment-free.

However, strong fairness is environment-free if the safe I/O automaton in question is
C-persistent, where C ⊆ local(A). Define A to be C-persistent if for each state s of A in which C
is enabled and each step (s, a, s′) where a ∈ in(A), C is enabled in s′. Thus, in any execution of
A, if C becomes enabled, C will stay enabled at least until a locally-controlled action has been
executed.
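
For an automaton with finitely many steps, C-persistence can be checked directly from the definition. The sketch below assumes the steps are given as (pre-state, action, post-state) triples; the function name is_persistent is hypothetical.

```python
def is_persistent(steps, input_actions, enabled_C):
    """Check C-persistence of a finite automaton.

    steps         -- iterable of (pre_state, action, post_state) triples
    input_actions -- the set in(A) of input actions
    enabled_C     -- state predicate: some action in C is enabled
    """
    return all(
        enabled_C(post)
        for pre, action, post in steps
        if action in input_actions and enabled_C(pre)
    )
```

Only input steps starting in a state where C is enabled are inspected; locally-controlled steps are allowed to disable C.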


Lemma 4.7
Let A be a safe I/O automaton and let Q_i, 1 ≤ i ≤ k, be temporal formulas over A of the form
WF(C_i), WF(C_i, P_i), SF(C_i), or SF(C_i, P_i), where


• C_i ⊆ local(A),
• P_i is a state predicate over A, and
• if Q_i = SF(C_i) or Q_i = SF(C_i, P_i), then A is C_i-persistent.


Then Q_1 ∧ ··· ∧ Q_k is an environment-free liveness formula for A.


Proof
This proof can be carried out similarly to the proof of Lamport and Abadi’s Proposition 4
in [AL92b]. (Note that [GSSL93] argues that Lamport and Abadi’s notion of
μ-machine-realizability is similar to our notion of environment-freedom. Furthermore, μ-invariance is similar
to our notion of C-persistence.)


Another important property of the fairness formulas is that they are stuttering-insensitive as 
expressed by the following lemma. 


Lemma 4.8
Any conjunction of temporal formulas of the form WF(C), WF(C, P), SF(C), and SF(C, P)
is stuttering-insensitive.


Proof
Directly by the definition of the fairness formulas and Proposition 3.4.
∎


4.2 Specifying Timed Systems 


We now turn attention to timed systems. As above we first describe how to specify safe timed
I/O automata, and then how to use our temporal logic to specify liveness.


4.2.1 Safe Timed I/O Automata


In this work we use two approaches for specifying safe timed I/O automata: explicit and implicit
specification. Both approaches describe state spaces using state variables as in the untimed
setting. The definition of safe timed I/O automata (Definition 2.17) describes that the time can
be obtained from any state by the .now mapping. Below we assume that


each safe timed I/O automaton has a special now state variable such that the .now
mapping simply returns the value of this state variable.


(We will not be able to see if s.now means the value of the now state variable in state s or the
result of applying the .now mapping to state s, but since, by definition, both interpretations
return the same time, this does not give rise to ambiguity.)

We denote by variables(A) the set of state variables (including now) of the safe timed I/O
automaton A. With this definition we can extend the definition of compatibility for safe timed
I/O automata (cf. Definition 2.18) by requiring the sets of state variables of the safe timed I/O
automata to be almost mutually disjoint (the sets may have only now in common): Safe
timed I/O automata A_1, ..., A_N are syntactically compatible if for all 1 ≤ i, j ≤ N with i ≠ j


1. out(A_i) ∩ out(A_j) = ∅
2. int(A_i) ∩ acts(A_j) = ∅
3. variables(A_i) ∩ variables(A_j) = {now}


As in the untimed setting we use, for brevity, the term “compatibility” to refer to syntactical
compatibility. The notion of compatibility trivially extends to live timed I/O automata (cf.
Definition 2.26). As in the untimed setting we can now characterize the state of a composed
safe timed I/O automaton A = A_1 ‖ ··· ‖ A_N by a “flat” mapping from variables(A_1) ∪ ··· ∪
variables(A_N) (i.e., variables(A)) to values such that s is the state of A if s ↾ variables(A_i) is
the state of the component A_i. This characterization is possible since all components must agree
on real time (cf. Definition 2.18).


Explicit Specification 


The explicit approach to specifying safe timed I/O automata is similar to our way of specifying
safe I/O automata: the state space and initial states are specified by a list of typed state
variables with possible initial values (the now variable must assume the value 0 initially), the
action signature is specified by using action generator functions to list input, output, and internal
actions and the special time-passage action ν, and the steps are specified using the
precondition-effect style.

Some of the state variables will typically be used to keep track of deadlines etc. Also, when 
specifying the steps using this explicit approach, the time-passage steps will have to be specified 
explicitly. The precondition for the time-passage steps will usually state that time is not allowed 
to pass beyond some deadlines representing times by which some other steps must have been 
executed. 

It must be proved that what we specify is in fact a safe timed I/O automaton (cf. Definition 2.1).
The axioms S1-S3 are easy to ensure: S1 is ensured by initializing now to 0, S2 is
ensured by leaving now unchanged in the step rules for visible and internal actions, and S3 is
ensured by requiring, in the step rule for ν, that time will increase. S4 and S5 are ensured if
time-passage steps change the now variable only and, from any time, time-passage steps to any
future time, possibly less than some deadline, are allowed.
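
A time-passage step rule in this explicit style can be sketched as follows. The encoding is an assumption for illustration: states are dictionaries with a now entry, deadlines are stored in ordinary state variables, and the names time_passage_enabled and time_passage are hypothetical.

```python
def time_passage_enabled(state, new_now, deadline_vars):
    """Precondition of the time-passage step to time new_now.

    Time must strictly increase and must not pass beyond any deadline
    stored in the state variables named in deadline_vars.
    """
    return new_now > state["now"] and all(
        new_now <= state[d] for d in deadline_vars)

def time_passage(state, new_now, deadline_vars):
    """Effect of the time-passage step: only the now variable changes."""
    if not time_passage_enabled(state, new_now, deadline_vars):
        raise ValueError("time-passage step is not enabled")
    return {**state, "now": new_now}
```

Other step rules would leave now unchanged; a deadline variable set to infinity (float("inf")) never blocks the passage of time.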


As in the untimed setting it is easy to construct the syntactic description of a safe timed I/O
automaton from the syntactic description of its components. The only difference compared to
the untimed setting is constructing the step rule for ν when dealing with the parallel composition
operator. In this case the preconditions of the step rules for ν have to be combined so that all
components allow the assignment to the (common) now variable. This turns out not to be a
problem in practice.


In some situations it is possible to avoid dealing explicitly with deadlines and time-passing when 
specifying safe timed I/O automata. This approach is described next. 


Implicit Specification 


In [MMT91] and [LA91] alternative models for timed systems are developed. We will refer to 
these models by “MMT-models” derived from the names of the authors of [MMT91]. As shown 
in [GSSL93] the model we use is a generalization of the MMT-models. 

In the MMT-models the locally-controlled actions are partitioned into classes and each class 
has associated with it a lower and an upper time bound that represent the minimum and maximum 
delay of the system when executing these actions. 

While these models are sufficient for the specification of many timed distributed systems, 
they are not sufficient for all the examples presented later in this work. However, because the 
MMT-models handle time implicitly, they tend to be easier to understand. 


Instead of developing a theory for MMT-models, we will merely, whenever possible, use the style 
of these models as a convenient way of specifying our safe timed I/O automata. So below we 
define a notion of MMT-specification and show what such a specification denotes in the model 
of safe timed I/O automata. 


Definition 4.9 (MMT-Specification) 
An MMT-specification $A_{MMT}$ is a triple where 

• automaton($A_{MMT}$) is a safe I/O automaton, 

• sets($A_{MMT}$) is a collection $C_1, \ldots, C_n$ of disjoint sets of locally-controlled actions of the 
safe I/O automaton automaton($A_{MMT}$), and 

• boundmap($A_{MMT}$) is a mapping that to each $C_i \in$ sets($A_{MMT}$) associates a lower time 
bound $b_l(C_i) \in T$ and an upper time bound $b_u(C_i) \in (T \setminus \{0\}) \cup \{\infty\}$, such that 
$b_u(C_i) \geq b_l(C_i)$. 

We let states($A_{MMT}$), etc., refer to the corresponding components of the underlying safe I/O 
automaton automaton($A_{MMT}$). 

The intuition behind an MMT-specification is as follows: Let the triple $(A, S, b)$ be an 
MMT-specification. $A$ itself contains no information about time but we will now “execute” it in a 
world that has a notion of real time and now. Suppose during execution that a set $C_i \in S$ becomes 
enabled at time $t$. Then $b$ specifies that if $C_i$ stays enabled, then an action from $C_i$ must be 
executed in the time interval $[t + b_l(C_i), t + b_u(C_i)]$. Thus, the boundmap specifies the time 
interval (relative to $t$) in which an action from $C_i$ must be executed, unless $C_i$ becomes disabled 
in the meantime. The same has to hold for $C_i$ if it stays enabled after being executed; thus, in 
this case a new legal interval is calculated based on the current time, $b_l(C_i)$, and $b_u(C_i)$. If $C_i$ 
becomes disabled, the timing constraints on $C_i$ are removed. 

To encode this idea into the model of safe timed I/O automata, we need to add several state 
variables. For instance we need to add the variable now representing real time, and for each 
of the sets $C_i$ we need to add two variables: first($C_i$) and last($C_i$) to denote the first and last 
(absolute) times at which an action from $C_i$ must be executed. In the encoding in our model, 
the first and last variables should then be set to the proper interval when the associated set 
$C_i$ becomes (re-)enabled and reset to “no timing constraints” (i.e., the interval $[0, \infty]$) when 
$C_i$ becomes disabled. Furthermore, actions in $C_i$ are only allowed to be executed if real time 
has reached first($C_i$). Additional time-passage steps also need to be added. These steps 
should only change now and are not allowed to let time pass beyond any of the last bounds. 
This idea is now formalized. 


Definition 4.10 


Let $A_{MMT}$ be an MMT-specification. Then time($A_{MMT}$) is the safe timed I/O automaton $A$ for 
which 

• each state $s$ of states($A$) consists of a state $s.basic$, which is a state of $A_{MMT}$, augmented 
with a new state variable now and, for each set $C_i$ of sets($A_{MMT}$), two new state variables 
first($C_i$) and last($C_i$). 

• start($A$) consists of states $s$ for which $s.basic$ is a start state of $A_{MMT}$, $s.now = 0$, and, 
for each set $C_i$ of sets($A_{MMT}$), if $C_i$ is enabled in $s.basic$ then first($C_i$) $= b_l(C_i)$ and 
last($C_i$) $= b_u(C_i)$; otherwise, first($C_i$) $= 0$ and last($C_i$) $= \infty$. 

• ext($A$) = ext($A_{MMT}$) $\cup \{\nu\}$. 

• $(s, a, s') \in$ steps($A$) iff the following conditions hold: 

1. If $a \in$ acts($A_{MMT}$) then 
(a) $s'.now = s.now$. 
(b) $(s.basic, a, s'.basic) \in$ steps($A_{MMT}$). 
(c) For each $C_i \in$ sets($A_{MMT}$): 
i. If $a \in C_i$ then $s.first(C_i) \leq s.now$. 
ii. If $C_i$ is enabled in both $s.basic$ and $s'.basic$, and $a \notin C_i$, then $s'.first(C_i) = 
s.first(C_i)$ and $s'.last(C_i) = s.last(C_i)$. 
iii. If $C_i$ is enabled in $s'.basic$ and either $a \in C_i$ or $C_i$ is not enabled in $s.basic$, 
then $s'.first(C_i) = s'.now + b_l(C_i)$ and $s'.last(C_i) = s'.now + b_u(C_i)$. 
iv. If $C_i$ is not enabled in $s'.basic$ then $s'.first(C_i) = 0$ and $s'.last(C_i) = \infty$. 

2. If $a = \nu$ then 
(a) $s'.now > s.now$. 
(b) $s'.basic = s.basic$. 
(c) $s'.now \leq s'.last(C_i)$ for all $C_i \in$ sets($A_{MMT}$). 
(d) $s'.first(C_i) = s.first(C_i)$ and $s'.last(C_i) = s.last(C_i)$ for all $C_i \in$ sets($A_{MMT}$). 
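
The construction in Definition 4.10 can be sketched for a finite toy specification as follows. This is only an illustration under stated assumptions: the `MMTSpec` container, the dictionary-based state layout, and the function names are hypothetical, and the bounds are read as inclusive (first ≤ now for discrete steps, now ≤ last for time passage).

```python
import math
from dataclasses import dataclass

INF = math.inf


@dataclass
class MMTSpec:
    """Hypothetical finite encoding of an MMT-specification."""
    steps: set      # (basic_state, action, basic_state') triples
    classes: dict   # class name -> set of locally-controlled actions
    bounds: dict    # class name -> (b_l, b_u)

    def enabled(self, basic, cls):
        # A class is enabled if some action of the class has a step from `basic`.
        return any(s == basic and a in self.classes[cls] for (s, a, _) in self.steps)


def start_state(spec, basic):
    first, last = {}, {}
    for c, (bl, bu) in spec.bounds.items():
        first[c], last[c] = (bl, bu) if spec.enabled(basic, c) else (0.0, INF)
    return {"basic": basic, "now": 0.0, "first": first, "last": last}


def discrete_step(spec, s, a, basic_next):
    """Successor state for action `a`, or None if condition 1 is violated."""
    if (s["basic"], a, basic_next) not in spec.steps:       # 1(b)
        return None
    now, first, last = s["now"], {}, {}                     # 1(a): now unchanged
    for c, (bl, bu) in spec.bounds.items():
        in_c = a in spec.classes[c]
        if in_c and s["first"][c] > now:                    # 1(c)i
            return None
        if not spec.enabled(basic_next, c):                 # 1(c)iv
            first[c], last[c] = 0.0, INF
        elif in_c or not spec.enabled(s["basic"], c):       # 1(c)iii
            first[c], last[c] = now + bl, now + bu
        else:                                               # 1(c)ii
            first[c], last[c] = s["first"][c], s["last"][c]
    return {"basic": basic_next, "now": now, "first": first, "last": last}


def time_passage(spec, s, target):
    """Successor state for a time-passage step to `target`, or None (condition 2)."""
    if target <= s["now"] or any(target > s["last"][c] for c in spec.bounds):
        return None
    return dict(s, now=target)  # basic, first and last are left unchanged


# Toy example: one class TICK with bounds [2, 5]; "tick" is enabled in states 0 and 1.
spec = MMTSpec(steps={(0, "tick", 1), (1, "tick", 2)},
               classes={"TICK": {"tick"}},
               bounds={"TICK": (2.0, 5.0)})
s0 = start_state(spec, 0)
s1 = time_passage(spec, s0, 3.0)
s2 = discrete_step(spec, s1, "tick", 1)
print(s2["first"]["TICK"], s2["last"]["TICK"])
```

In the toy run, the first tick may not occur before time 2, time may not advance past 5 while the tick is pending, and after the tick at time 3 a fresh interval relative to the current time is installed.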


It is easy to see that time($A_{MMT}$) is in fact a safe timed I/O automaton (cf. Definition 2.17). 
Specifically, axiom S1 is ensured since now is initialized to 0, S2 is ensured since, by explicit 
construction, now does not change in steps labeled by visible or internal actions, S3 is ensured 
since time-passage steps are explicitly required to increase time, and finally S4 and S5 are easily 
seen to be ensured since time($A_{MMT}$) from any time allows time-passage to any future time less 
than some deadline (expressed by the last variables) and time-passage steps do not change the 
basic part of the state. 

When using the implicit approach to specifying safe timed I/O automata, we use the 
precondition-effect style of Section 4.1.1 to specify the underlying safe I/O automaton, and 
then use standard notation (cf. Appendix A) to specify the sets of locally-controlled actions 
and the boundmap. Based on the simple way the new variables (now and the first and last 
variables) are manipulated, it is easy to construct an explicit description of time($A_{MMT}$) based 
on the description of $A_{MMT}$. 

We refer to Chapter 10 for an example of the implicit style of specification. 


4.2.2 Live Timed I/O Automata 


If we were to follow the lines of the untimed section when specifying the liveness condition 
for a safe timed I/O automaton, we should devise some temporal logic in which formulas were 
evaluated over timed executions. However, we take a different approach. The idea is that a 
timed execution can be characterized by a set of (ordinary) executions each of which can be 
thought of as a sampling of the timed execution. Thus, there exists a close relationship between 
timed executions and (ordinary) executions of a safe timed I/O automaton. 

We proceed by defining the notion of sampling. Then we define what constitutes a sampling 
characterization of a liveness condition, show how the operations on live timed I/O automata 
are reflected in the syntax describing the liveness of the live timed I/O automata, and finally 
discuss the notions of weak and strong fairness in the timed setting. 


4.2.2.1 Sampling 


All definitions and lemmas in this section are taken from [GSSL93] and are similar to those of 
[LV93b]. 


Roughly speaking, an (ordinary) execution fragment can be regarded as “sampling” the state 
information in a timed execution fragment at a countable number of points in time. Formally, 
we say that an execution fragment $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ of $A$ samples a timed execution fragment 
$\Sigma = \omega_0 b_1 \omega_1 b_2 \omega_2 \cdots$ of $A$ if there is a monotone increasing mapping $f : N \to N$ such that the 
following conditions are satisfied. 

1. $f(0) = 0$, 

2. $b_i = a_{f(i)}$ for all $i \geq 1$, 

3. $a_j = \nu$ for all $j$ not in the range of $f$, 

4. For all $i \geq 0$ such that $\omega_i$ is not the last trajectory in $\Sigma$, 
(a) $s_j \in rng(\omega_i)$ for all $j$, $f(i) \leq j < f(i+1)$, 
(b) $s_{f(i)}.now = ftime(\omega_i)$, and 
(c) $s_{f(i+1)-1}.now = ltime(\omega_i)$. 

5. If $\omega_i$ is the last trajectory in $\Sigma$, then 
(a) $s_j \in rng(\omega_i)$ for all $j$, $f(i) \leq j$, 
(b) $s_{f(i)}.now = ftime(\omega_i)$, and 
(c) $\sup\{s_j.now \mid f(i) \leq j\} = ltime(\omega_i)$. 

In other words, the function $f$ in this definition maps the (indices of) actions in $\Sigma$ to 
corresponding (indices of) actions in $\alpha$, in such a way that exactly the non-time-passage actions of $\alpha$ 
are included in the range. Condition 4 is a consistency condition relating the first and last times 
for each non-final trajectory to the times produced by the appropriate steps of $\alpha$. Condition 5 
gives a similar consistency condition for the first time of the final trajectory (if any); in place of 
the consistency condition for the last time, there is a “cofinality” condition asserting that the 
times grow to the same limit in both executions. 

The following two straightforward lemmas show the relationship between timed execution 
fragments and ordinary execution fragments. 


Lemma 4.11 


Let $A$ be a safe timed I/O automaton. If $\alpha \in$ frag($A$), then there is a timed execution fragment 
$\Sigma \in$ t-frag($A$) such that $\alpha$ samples $\Sigma$. 


Lemma 4.12 


Let $A$ be a safe timed I/O automaton. If $\Sigma \in$ t-frag($A$), then there is an execution fragment 
$\alpha \in$ frag($A$) such that $\alpha$ samples $\Sigma$. 


Recall that an execution fragment $\alpha$ is finite if it is a finite sequence. Furthermore, in the timed 
setting, an execution fragment $\alpha$ is defined to be admissible if there is no finite upper bound 
on the .now values of the states in $\alpha$. Finally, an execution fragment is said to be Zeno if it is 
neither finite nor admissible. We denote by $exec^{*}(A)$, $exec^{\infty}(A)$, and $exec^{Z}(A)$ the sets of finite, 
admissible, and Zeno executions of a safe timed I/O automaton $A$. 
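
As a small illustration of the last two classes, consider two infinite execution fragments whose $n$-th states carry the following now values. The concrete time stamps are assumptions chosen for this example and do not come from the protocols studied later.

$$s_n.now = n \;\Longrightarrow\; \sup_n s_n.now = \infty \quad \text{(admissible)}$$

$$s_n.now = 1 - 2^{-n} \;\Longrightarrow\; \sup_n s_n.now = 1 < \infty \quad \text{(infinite but not admissible, hence Zeno)}$$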


Lemma 4.13 
If $\alpha$ samples $\Sigma$ then 
1. $\alpha$ is finite iff $\Sigma$ is finite, 
2. $\alpha$ is admissible iff $\Sigma$ is admissible, and 
3. $\alpha$ is Zeno iff $\Sigma$ is Zeno. 

It is possible to give a sensible definition of the timed trace of an ordinary execution fragment 
of a safe timed I/O automaton. Namely, suppose $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$ is an execution fragment of 
a safe timed I/O automaton $A$. First, define $ltime(\alpha)$ to be the supremum of the .now values of 
all the states in $\alpha$. Then let $\delta$ be the sequence consisting of the actions in $\alpha$ paired with their 
times of occurrence: 

$$\delta = (a_1, s_1.now)(a_2, s_2.now) \cdots$$ 

Then t-trace($\alpha$), the timed trace of $\alpha$, is defined to be the pair 

$$\text{t-trace}(\alpha) = (\delta \lceil (vis(A) \times T),\; ltime(\alpha))$$ 
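
For a finite execution fragment the timed trace can be computed directly, since the supremum is then just a maximum. The following Python sketch assumes a hypothetical encoding in which a fragment is given by the list of now values of its states and the list of its actions; the function name `t_trace` and the action labels are illustrative only.

```python
def t_trace(state_nows, actions, visible):
    """Timed trace of a finite execution fragment s_0 a_1 s_1 ... a_n s_n.

    state_nows: [s_0.now, ..., s_n.now]; actions: [a_1, ..., a_n].
    Returns the pairs (a_i, s_i.now) restricted to visible actions,
    together with ltime, the largest now value in the fragment.
    """
    assert len(state_nows) == len(actions) + 1
    ltime = max(state_nows)  # supremum of a finite, non-empty set
    paired = [(a, state_nows[i + 1]) for i, a in enumerate(actions)]
    return [(a, t) for (a, t) in paired if a in visible], ltime


# "nu" stands for the time-passage action; "internal" is an internal action.
nows = [0.0, 0.0, 1.5, 1.5, 4.0, 4.0]
acts = ["send", "nu", "internal", "nu", "recv"]
print(t_trace(nows, acts, visible={"send", "recv"}))
```

Time-passage and internal actions are dropped from the sequence, but the time they consume still shows up in the time stamps of later visible actions and in the second component of the pair.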


The following lemma shows that the definitions of timed traces for execution fragments and 
timed execution fragments are properly related: 


Lemma 4.14 
If $\alpha$ samples $\Sigma$ then t-trace($\alpha$) = t-trace($\Sigma$). 


4.2.2.2 Sampling Characterization of Liveness Conditions 


As mentioned above we will characterize liveness conditions for safe timed I/O automata by a 
set of ordinary executions. 

Let $A$ be a safe timed I/O automaton and let $L_s \subseteq exec^{\infty}(A)$ be a set of admissible (ordinary) 
executions of $A$. Then $L_s$ is said to be a sampling characterization of the set 

$$L = \{\Sigma \in \text{t-}exec^{\infty}(A) \mid \text{for all } \alpha, \text{ if } \alpha \text{ samples } \Sigma, \text{ then } \alpha \in L_s\} \qquad (4.6)$$ 

That is, $L$ contains all those admissible timed executions of $A$ that have all their samplings in 
$L_s$. We say that $L$ is induced by the sampling characterization $L_s$. Note that the sampling 
characterization $L_s$ may contain “extra” executions that are not samplings of any timed 
executions in the set $L$ induced by $L_s$. (Such an extra execution will be the sampling of some timed 
execution $\Sigma$, but since not all samplings of $\Sigma$ are in $L_s$, $\Sigma$ is not in $L$.) If $L_s$ coincides with 
the set of all samplings of all timed executions in the set $L$ induced by $L_s$, i.e., if $L_s$ does not 
contain any “extra” executions, then $L_s$ is said to be minimal. 
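
The relationship between $L_s$, the induced set $L$, and minimality can be pictured on a finite toy model. The sketch below is only an analogy: real timed executions have infinitely many samplings, whereas here a hypothetical dictionary `samplings` lists, for each named timed execution, a finite set of named samplings.

```python
def induced(samplings, L_s):
    """Timed executions all of whose samplings lie in L_s (cf. Equation (4.6))."""
    return {sigma for sigma, alphas in samplings.items() if alphas <= L_s}


def is_minimal(samplings, L_s):
    """L_s is minimal if it equals the set of all samplings of the induced set."""
    covered = set().union(*(samplings[sigma] for sigma in induced(samplings, L_s)))
    return covered == L_s


# Hypothetical data: two timed executions with two samplings each.
samplings = {"Sigma1": {"alpha1", "alpha2"}, "Sigma2": {"alpha3", "alpha4"}}

L_extra = {"alpha1", "alpha2", "alpha3"}   # alpha3 is an "extra" execution
L_tight = {"alpha1", "alpha2"}

print(induced(samplings, L_extra), is_minimal(samplings, L_extra))
print(induced(samplings, L_tight), is_minimal(samplings, L_tight))
```

Both characterizations induce the same set, but only the second is minimal: in the first, `alpha3` samples `Sigma2`, yet `Sigma2` is excluded because its other sampling is missing.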

If the set $L$ induced by $L_s$ is a liveness condition for $A$, $L_s$ is said to be a liveness sampling 
characterization for $A$. Furthermore, if $(A, L)$ is a live timed I/O automaton, i.e., if 
$(A, L \cup \text{t-}exec^{Zt}(A))$ is environment-free, $L_s$ is said to be environment-free for $A$. 


A liveness sampling characterization for some safe timed I/O automaton $A$ can now be specified 
indirectly in exactly the same way we defined liveness conditions in the untimed setting using 
temporal formulas. Thus, for any temporal formula $Q$ over $A$ we refer to the set 

$$L_s = \{\alpha \in exec^{\infty}(A) \mid \alpha \models Q\} \qquad (4.7)$$ 

as the sampling characterization induced by $Q$. If $L_s$ is a liveness sampling characterization for 
$A$, $Q$ is referred to as a timed liveness formula for $A$. Furthermore, if $L_s$ is environment-free or 
minimal, $Q$ is said to be environment-free or minimal, respectively. Finally, if $L$ is induced by 
$L_s$ which, in turn, is induced by $Q$, we say that $L$ is induced by $Q$. 

4.2.2.3 Operations on Live Timed I/O Automata 


As in the untimed setting we now show how the liveness of live timed I/O automata obtained 
as results of the operators (parallel composition, action hiding, and action renaming) is induced 
by temporal formulas derived from the temporal formulas inducing the liveness of the live timed 
I/O automata to which the operators were applied. 


We start by looking at parallel composition and for that we need the following result, which 
expresses the relationship between sampling and projection ($\lceil$). We state the result without 
proof (except we note that point 3 follows from points 1 and 2). 


Lemma 4.15 


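Let $A_1, \ldots, A_N$ be compatible safe timed I/O automata, $A = A_1 \| \cdots \| A_N$, and $\Sigma \in$ t-exec($A$). 
Then, for all $1 \leq i \leq N$, 

1. if $\alpha$ samples $\Sigma$, then $\alpha \lceil A_i$ samples $\Sigma \lceil A_i$, 
2. if $\alpha_i$ samples $\Sigma \lceil A_i$, then there exists an $\alpha$ such that $\alpha$ samples $\Sigma$ and $\alpha_i = \alpha \lceil A_i$, and 
3. $\{\alpha \lceil A_i \mid \alpha \text{ samples } \Sigma\} = \{\alpha_i \mid \alpha_i \text{ samples } \Sigma \lceil A_i\}$. 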


Lemmas 4.2 and 4.3 above for safe I/O automata are actually valid for safe timed I/O automata 
as well. We restate the timed version of Lemma 4.3. 


Lemma 4.16 

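Let $A_1, \ldots, A_N$ be compatible safe timed I/O automata and $Q_1, \ldots, Q_N$ be stuttering-insensitive 
temporal formulas over $A_1, \ldots, A_N$, respectively. Let $A = A_1 \| \cdots \| A_N$ and $\alpha \in$ exec($A$). Then, 

$$\alpha \lceil A_1 \models Q_1 \text{ and } \cdots \text{ and } \alpha \lceil A_N \models Q_N \quad \text{iff} \quad \alpha \models Q_1 \wedge \ldots \wedge Q_N$$ 
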


The main result for parallel composition of live timed I/O automata can now be stated and 
proved. 


Proposition 4.17 


Let $(A_1, L_1), \ldots, (A_N, L_N)$ be compatible live timed I/O automata and $Q_1, \ldots, Q_N$ be 
stuttering-insensitive temporal formulas over $A_1, \ldots, A_N$, respectively, such that each $L_i$ is induced by $Q_i$. 
Let $(A, L) = (A_1, L_1) \| \cdots \| (A_N, L_N)$. Then $L$ is induced by $Q_1 \wedge \ldots \wedge Q_N$. 


Proof 
In Appendix B. 


Attention is now turned to the simpler operations of action hiding and action renaming. 
Proposition 4.18 


Let $(A, L)$ be a live timed I/O automaton such that $L$ is induced by the temporal formula $Q$ for 
$A$ and let $\Lambda \subseteq$ local($A$). Then the liveness condition of $(A, L) \setminus \Lambda$ is induced by $Q$. 


Proof 
In Appendix B. 


Proposition 4.19 


Let $(A, L)$ be a live timed I/O automaton such that $L$ is induced by the temporal formula $Q$ for 
$A$, and let $\rho$ be an action mapping applicable to $(A, L)$. Define $\rho(Q)$ to be the temporal formula 
obtained by applying $\rho$ to every action function in $Q$. Then the liveness condition of $\rho((A, L))$ 
is induced by $\rho(Q)$. 


Proof 
In Appendix B. 


4.2.2.4 Fairness 


The fairness formulas (Equations (4.2)—(4.5)) presented in the untimed setting also express fair- 
ness requirements in the timed setting. However, fairness in the timed setting is not necessarily 
environment-free as in the untimed setting. 

The problem is that environment-freedom can be jeopardized because the system may col- 
laborate with the environment to generate non-Zeno-tolerant outcomes, as explained in Sec- 
tion 2.2.2, regardless of the fairness formulas. We do not investigate further if weak and strong 
fairness are environment-free for certain classes of safe timed I/O automata. 


4.3 Embedding 


In Section 2.3 we introduced the patient operator, which takes a safe or live I/O automaton as 
argument and returns the corresponding safe or live timed I/O automaton, respectively, that 
allows time to pass arbitrarily. 

The patient operator on safe I/O automata (cf. Definition 2.34) adds an extra state component 
representing real time. When describing state spaces using state variables, we shall assume 
that the patient operator adds an extra state variable called now (in addition to the extra 
time-passage action $\nu$). Thus, we shall assume that now is not a state variable of any safe I/O 
automaton to which we apply patient. 

In Section 2.3 we described what it means to untime a timed execution of a patient safe 
I/O automaton. A similar definition can be given for ordinary executions: let $A$ be a safe I/O 
automaton such that now $\notin$ variables($A$) and $\nu \notin$ acts($A$), and let $A_p$ = patient($A$). Then for 
any $\alpha \in$ exec($A_p$), define untime($\alpha$) to be the execution of $A$ obtained from $\alpha$ by restricting 
every state to the state variables of $A$ and removing every time-passage step (such steps do not 
change the state variables of $A$). Formally we have 


$$untime(\alpha) = \natural(\alpha \lceil (variables(A), acts(A)))$$ 
The following lemma, which we state without proof, says that the definition of untime($\alpha$) is 
sensible. 


Lemma 4.20 


Let $A$ be a safe I/O automaton such that now $\notin$ variables($A$) and $\nu \notin$ acts($A$), and let $A_p$ = 
patient($A$). Then, for any $\Sigma \in$ t-exec($A_p$) and $\alpha \in$ exec($A_p$), if $\alpha$ samples $\Sigma$, then untime($\alpha$) = 
untime($\Sigma$). 


Lemma 4.21 


Let $A$ be a safe I/O automaton and let $Q$ be a stuttering-insensitive temporal formula over $A$. 
Furthermore, let $A_p$ = patient($A$). Then, for all $\alpha \in$ exec($A_p$), 

untime($\alpha$) $\models Q$ if $\alpha \models Q$ 


Proof 
In Appendix B. 


We can now state and prove the main result of this section, namely that stuttering-insensitive 
temporal formulas carry over as environment-free liveness formulas when applying the patient 
operator. 


Proposition 4.22 


Let $(A, L)$ be a live I/O automaton with $L$ induced by a stuttering-insensitive temporal formula 
$Q$ over $A$. Furthermore, let $(A_p, L_p)$ = patient($A, L$). Then, $L_p$ is induced by $Q$, and $Q$ is 
minimal. 


Proof 
In Appendix B. 


The minimality of $Q$ as implied by the proposition will be important when proving that a live 
timed I/O automaton correctly implements the patient version of a live I/O automaton. In fact, 
as we shall see in the next chapter, our proof techniques in the timed setting require the liveness 
conditions of certain live timed I/O automata to be induced by minimal temporal formulas. 


This concludes this chapter. We have described how to specify safe (timed) I/O automata using 
a precondition-effect language and how to use the temporal logic defined in Chapter 3 to specify 
liveness. Furthermore, this chapter contains several results which state how operations in the 
semantic model are reflected in the syntax. 

Before we start the protocol verification example in Part II of this report, the next chapter 
presents a number of proof techniques for proving correctness. 


Chapter 5 


Proof Techniques 


The previous chapters have defined the general models of timed and untimed systems that we 
will use in this work, and described our approach to specifying objects of these models. This 
chapter is devoted to presenting a host of proof techniques for proving that one live (timed) I/O 
automaton correctly or safely implements another live (timed) I/O automaton. 

In Chapter 2 the notions of safe and correct implementation are defined. These notions are, 
for both untimed and timed systems, based on the (timed) traces that the involved systems 
can exhibit. For safe implementation, all (timed) traces are considered, whereas correct imple- 
mentation restricts attention to live (timed) traces. The respective implementation notions are 
then expressed as the subset relation between the sets of all/live (timed) traces of the involved 
systems. 

For untimed systems, reasoning about implementation directly in terms of trace inclusion 
is not feasible. First of all, traces are defined implicitly as the traces of the executions, and 
second, the liveness condition is defined implicitly as the set of executions that satisfy a certain 
temporal formula. Thus, the sets of traces and live traces are not readily available but are 
derived from safe I/O automata and temporal formulas. This calls for some proof techniques 
that are based on this available information and that are sound with respect to the safe and 
correct implementation relations. 

The same discussion is valid for timed systems as well. In timed systems there is even an 
extra level of indirection since the liveness condition of a live timed I/O automaton is usually 
induced by a sampling characterization which, in turn, is induced by a temporal formula. 

We first present, in Section 5.1, the proof techniques used for untimed systems, and then, in 
Section 5.2, these techniques are extended to timed systems. Most of the techniques are taken 
from [GSSL93] and are included here to make this report self-contained. We refer to [GSSL93] 
for details and proofs. 


5.1 Untimed Systems 


This section presents a number of techniques that prove the safe implementation relation and 
assist in proving the correct implementation relation for live I/O automata. The techniques 
are based on simulations between safe I/O automata, which are sound with respect to the safe 
implementation relation, i.e., trace inclusion. 

However, as shown in [GSSL93], it turns out that a stronger result can be proved for the 
simulation techniques: that there is a certain correspondence between the executions of the 
involved safe I/O automata and not only between their traces. Since the liveness conditions of 
live I/O automata are stated in terms of executions and not in terms of traces, this result, which 
is called the Execution Correspondence Theorem, can form the basis for the proof of the correct 
implementation relation, i.e., live trace inclusion. 

Figure 5.1: Example of a simulation. The actions a and b are external actions. The rest of the 
transitions are thought of as labeled by internal actions. 

Thus, when proving correct implementation between two live I/O automata, first a simulation 
result between the safe I/O automata parts is proved and then this simulation result and the 
Execution Correspondence Theorem are used to prove live trace inclusion. 


We proceed by defining a number of simulation proof techniques and stating the Execution 
Correspondence Theorem. Then we present the proof techniques for proving the safe and correct 
implementation relations. Finally, we consider the additional proof technique of adding history 
variables. 


5.1.1 Simulation Proof Techniques 


A simulation from A to B, where A and B are safe I/O automata with the same input and 
output actions, is a relation between the states of A and the states of B such that certain 
conditions hold. A will be referred to as the concrete, low-level, or implementation safe I/O 
automaton, and B as the abstract, high-level, or specification safe I/O automaton. 

Exactly what conditions a simulation must satisfy depends on the kind of simulation. Below 
we define notions of, e.g., forward and backward simulations which differ in a few but important 
respects. Generally, however, two conditions must be satisfied: first, the start states of the two 
safe I/O automata must be related in a certain way, and, second, each step of the low-level safe 
I/O automaton must “correspond” to a sequence of steps of the high-level safe I/O automaton. 

The second condition is depicted in Figure 5.1. For each step of the low-level safe I/O 
automaton, i.e., for each low-level step, there must exist a sequence of (high-level) steps of the 
high-level safe I/O automaton between states related—by the simulation relation—to the pre- 
and post-state of the low-level step, such that the sequence of high-level steps contains exactly 
the same external actions as the low-level step. How the sequence of high-level steps is selected 
depends on what kind of simulation is considered. 

Below forward simulations, refinement mappings, and backward simulations are defined. We 
refer to [GSSL93, LV93a, Jon91] for more details about these simulations. 

The simulation techniques use invariants of the safe I/O automata to restrict the steps 
needed to be considered. Define an invariant of a safe I/O automaton A to be any set of states 
of A that is a superset of the reachable states of A. Equivalently, an invariant can be defined to 
be a state formula over A that is satisfied by at least all reachable states of A. We will use the 
two definitions interchangeably. 

The following notational convention is used: if $R$ is a relation over $S_1 \times S_2$ and $s_1 \in S_1$, then 
$R[s_1]$ denotes the set $\{s_2 \in S_2 \mid (s_1, s_2) \in R\}$. 


Definition 5.1 (Forward simulation) 


Let $A$ and $B$ be safe I/O automata with in($A$) = in($B$) and out($A$) = out($B$) and with invariants 
$I_A$ and $I_B$, respectively. A forward simulation from $A$ to $B$, with respect to $I_A$ and $I_B$, is a 
relation $f$ over states($A$) $\times$ states($B$) that satisfies: 

1. If $s \in$ start($A$) then $f[s] \cap$ start($B$) $\neq \emptyset$. 

2. If $(s, a, s') \in$ steps($A$), $s, s' \in I_A$, and $u \in f[s] \cap I_B$, then there exists an $\alpha \in frag^{*}(B)$ with 
fstate($\alpha$) $= u$, lstate($\alpha$) $\in f[s']$, and trace($\alpha$) = trace($a$). 

We write $A \leq_F B$ if there exists a forward simulation from $A$ to $B$ with respect to some invariants 
$I_A$ and $I_B$. If $f$ is a forward simulation from $A$ to $B$ with respect to some invariants $I_A$ and $I_B$, 
we write $A \leq_F B$ via $f$. 
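
For finite-state automata, the two conditions of Definition 5.1 can be checked mechanically. The following Python sketch is an illustration under assumptions of our own: automata are encoded as dictionaries with `start` and `steps` entries, every action outside the shared set `ext` is treated as internal, and the helper names are hypothetical.

```python
from collections import deque


def matching_targets(b_steps, ext, u, a):
    """States of B reachable from u by a finite fragment with trace equal to trace(a)."""
    need = a in ext                      # trace(a) is (a) if a is external, else empty
    seen = {(u, False)}                  # (state, external action already matched?)
    queue = deque(seen)
    while queue:
        state, done = queue.popleft()
        for (p, b, q) in b_steps:
            if p != state:
                continue
            if b not in ext:             # internal steps of B are always allowed
                nxt = (q, done)
            elif need and not done and b == a:
                nxt = (q, True)          # match the single external action
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {q for (q, done) in seen if done == need}


def is_forward_simulation(A, B, f, inv_A, inv_B, ext):
    image = lambda s: {u for (x, u) in f if x == s}
    # Condition 1: every start state of A is related to some start state of B.
    if any(not (image(s) & B["start"]) for s in A["start"]):
        return False
    # Condition 2: every step of A is matched from every related state of B.
    for (s, a, s2) in A["steps"]:
        if s in inv_A and s2 in inv_A:
            for u in image(s) & inv_B:
                if not (matching_targets(B["steps"], ext, u, a) & image(s2)):
                    return False
    return True


# Toy example: A performs an internal step "tau" before the external action "a".
A = {"start": {0}, "steps": {(0, "tau", 1), (1, "a", 2)}}
B = {"start": {"x"}, "steps": {("x", "a", "y")}}
f_good = {(0, "x"), (1, "x"), (2, "y")}
f_bad = {(0, "x"), (1, "y"), (2, "y")}
print(is_forward_simulation(A, B, f_good, {0, 1, 2}, {"x", "y"}, ext={"a"}))
print(is_forward_simulation(A, B, f_bad, {0, 1, 2}, {"x", "y"}, ext={"a"}))
```

A refinement mapping corresponds to the special case in which every state of A is related to exactly one state of B.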


A refinement mapping is a special case of a forward simulation where the relation is a function. 
Because of its practical importance (cf. [AL91]) we give an explicit definition. 


Definition 5.2 (Refinement mapping) 


Let $A$ and $B$ be safe I/O automata with in($A$) = in($B$) and out($A$) = out($B$) and with invariants 
$I_A$ and $I_B$, respectively. A refinement mapping from $A$ to $B$, with respect to $I_A$ and $I_B$, is a 
function $r$ from states($A$) to states($B$) that satisfies: 

1. If $s \in$ start($A$) then $r(s) \in$ start($B$). 

2. If $(s, a, s') \in$ steps($A$), $s, s' \in I_A$, and $r(s) \in I_B$, then there exists an $\alpha \in frag^{*}(B)$ with 
fstate($\alpha$) $= r(s)$, lstate($\alpha$) $= r(s')$, and trace($\alpha$) = trace($a$). 

We write $A \leq_R B$ if there exists a refinement mapping from $A$ to $B$ with respect to some 
invariants $I_A$ and $I_B$. If $r$ is a refinement mapping from $A$ to $B$ with respect to some invariants 
$I_A$ and $I_B$, we write $A \leq_R B$ via $r$. 


In a forward simulation there has to be a sequence of high-level steps starting from any of 
the high-level states related to the low-level pre-state and ending in some state related to the 
low-level post-state. The word “forward” thus refers to the fact that the high-level sequence of 
steps is constructed from any possible pre-state in a forward direction toward the set of possible 
post-states. 

In a backward simulation, on the other hand, there has to be a sequence of high-level steps 
ending in any state related to the low-level post-state and starting in some state related to the 
low-level pre-state. Thus, in a backward simulation the steps are constructed in a backward 
direction. 
This difference between forward and backward simulations implies that they apply to different 
situations. In some cases a forward simulation is needed whereas other situations might 
require a backward simulation. We shall see examples of this below. 

We need the auxiliary definition of image-finiteness. A relation $R$ over $S_1 \times S_2$ is image-finite 
if for each $s_1 \in S_1$, $R[s_1]$ is a finite set. 


Definition 5.3 (Backward simulation) 


Let $A$ and $B$ be safe I/O automata with in($A$) = in($B$) and out($A$) = out($B$) and with invariants 
$I_A$ and $I_B$, respectively. A backward simulation from $A$ to $B$, with respect to $I_A$ and $I_B$, is a 
relation $b$ over states($A$) $\times$ states($B$) that satisfies: 

1. If $s \in I_A$ then $b[s] \cap I_B \neq \emptyset$. 

2. If $s \in$ start($A$) then $b[s] \cap I_B \subseteq$ start($B$). 

3. If $(s, a, s') \in$ steps($A$), $s, s' \in I_A$, and $u' \in b[s'] \cap I_B$, then there exists an $\alpha \in frag^{*}(B)$ with 
lstate($\alpha$) $= u'$, fstate($\alpha$) $\in b[s] \cap I_B$, and trace($\alpha$) = trace($a$). 

We write $A \leq_B B$ if there exists a backward simulation from $A$ to $B$ with respect to some 
invariants $I_A$ and $I_B$. If furthermore the backward simulation is image-finite, we write $A \leq_{iB} B$. 
If $b$ is a backward simulation from $A$ to $B$ with respect to some invariants $I_A$ and $I_B$, we write 
$A \leq_B B$ (or $A \leq_{iB} B$ when $b$ is image-finite) via $b$. 


In [LV93a] abstract notions of history variables [OG76, AL91] and prophecy variables [AL91] are 
given in terms of history relations and prophecy relations. Below, in Section 5.1.5, we consider 
history and prophecy variables and show how history variables can be added to a specification. 


5.1.2 Execution Correspondence 


This subsection introduces the Execution Correspondence Theorem (ECT). The ECT states that 
if any of the simulations from above has been proven from a low-level safe I/O automaton A to 
a high-level safe I/O automaton B, then for any execution of A, there exists a “corresponding” 
execution of B. In order to formalize this notion of correspondence, the notions of R-relation 
and index mapping are first introduced. 


Definition 5.4 (R-relation and index mappings) 
Let $A$ and $B$ be safe I/O automata with in($A$) = in($B$) and out($A$) = out($B$) and let $R$ be 
a relation over states($A$) $\times$ states($B$). Furthermore, let $\alpha$ and $\alpha'$ be executions of $A$ and $B$, 
respectively. 

$$\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$$ 
$$\alpha' = u_0 b_1 u_1 b_2 u_2 \cdots$$ 

We say that $\alpha$ and $\alpha'$ are $R$-related, written $(\alpha, \alpha') \in R$, if there exists a total, nondecreasing 
mapping (see the footnote below) $m : \{0, 1, \ldots, |\alpha|\} \to \{0, 1, \ldots, |\alpha'|\}$ such that 

1. $m(0) = 0$, 
2. $(s_i, u_{m(i)}) \in R$ for all $0 \leq i \leq |\alpha|$, 
3. trace($b_{m(i-1)+1} \cdots b_{m(i)}$) = trace($a_i$) for all $0 < i \leq |\alpha|$, and 
4. for all $j$, $0 \leq j \leq |\alpha'|$, there exists an $i$, $0 \leq i \leq |\alpha|$, such that $m(i) \geq j$. 

The mapping $m$ is referred to as an index mapping from $\alpha$ to $\alpha'$ with respect to $R$. 

We write $(A, B) \in R$ if for every execution $\alpha$ of $A$, there exists an execution $\alpha'$ of $B$ such that 
$(\alpha, \alpha') \in R$. 

Footnote: If, e.g., $\alpha$ is infinite ($|\alpha| = \infty$), then the set $\{0, 1, \ldots, |\alpha|\}$ is supposed to denote the 
set of natural numbers (not including $\infty$), and $i \leq |\alpha|$ lets $i$ range over all natural numbers but not $\infty$. 
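
For finite executions, the four conditions on an index mapping can be checked directly. The Python sketch below assumes a hypothetical encoding: an execution is a pair (list of states, list of actions), the mapping m is a list with m[i] giving the image of index i, and `ext` is the set of external actions shared by both automata.

```python
def is_index_mapping(m, alpha, alpha2, R, ext):
    """Check the conditions of Definition 5.4 for two finite executions."""
    (s, a), (u, b) = alpha, alpha2
    n, k = len(a), len(b)                                   # |alpha| and |alpha'|
    if len(m) != n + 1 or m[0] != 0:                        # totality, Condition 1
        return False
    if any(not 0 <= m[i] <= k for i in range(n + 1)):       # range of m
        return False
    if any(m[i] > m[i + 1] for i in range(n)):              # nondecreasing
        return False
    if any((s[i], u[m[i]]) not in R for i in range(n + 1)): # Condition 2
        return False
    for i in range(1, n + 1):                               # Condition 3
        high = [x for x in b[m[i - 1]:m[i]] if x in ext]    # trace(b_{m(i-1)+1} ... b_{m(i)})
        low = [a[i - 1]] if a[i - 1] in ext else []         # trace(a_i)
        if high != low:
            return False
    return m[n] >= k                                        # Condition 4 (m is nondecreasing)


# Low-level execution: 0 -tau-> 1 -a-> 2; high-level execution: x -a-> y.
alpha = ([0, 1, 2], ["tau", "a"])
alpha2 = (["x", "y"], ["a"])
R = {(0, "x"), (1, "x"), (2, "y")}
print(is_index_mapping([0, 0, 1], alpha, alpha2, R, ext={"a"}))
print(is_index_mapping([0, 1, 1], alpha, alpha2, R, ext={"a"}))
```

The mapping [0, 0, 1] lets the internal low-level step correspond to an empty sequence of high-level steps, which is why m only needs to be nondecreasing rather than strictly increasing.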


Thus, an index mapping maps indices of states in the low-level execution to indices of states in the 
high-level execution. Effectively, an index mapping maps low-level states to corresponding 
high-level states such that the start states correspond (Condition 1), corresponding states are related 
by $R$ (Condition 2), and the external actions between two consecutive pairs of corresponding 
states are the same at both the low level and the high level (Condition 3). Condition 4 ensures 
that the high-level execution ($\alpha'$) is not “too long”, i.e., $\alpha'$ must not extend beyond the last 
state of $\alpha'$ corresponding to a state in $\alpha$ (if such a state exists). (Note that if $\alpha$ is finite, then 
$\alpha'$ must also be finite. However, even if $\alpha$ is infinite, $\alpha'$ can be finite if the index mapping is 
constant for indices above some bound.) 


The Execution Correspondence Theorem of [GSSL93] is now stated. The theorem states that 
if a relation $S$ has been proved to be a forward simulation, refinement mapping, or 
image-finite backward simulation from $A$ to $B$, then for any execution of $A$, there exists an $S$-related 
execution of $B$. 


Theorem 5.5 (Execution Correspondence Theorem) 

Let $A$ and $B$ be safe I/O automata with in($A$) = in($B$) and out($A$) = out($B$). Assume for 
$X \in \{F, R, iB\}$ that $A \leq_X B$ via $S$. Then $(A, B) \in S$. 



5.1.3 Proving Safe Implementation 


The simulation proof techniques presented above are sound proof techniques for the safe imple- 
mentation relation. Before we state this result, we first show two results relating the traces of 
R-related executions. 


Lemma 5.6 


Let $A$ and $B$ be safe I/O automata with in($A$) = in($B$) and out($A$) = out($B$) and let $R$ be a 
relation over states($A$) $\times$ states($B$). Assume that $(\alpha, \alpha') \in R$ and let $m$ be any index mapping 
from $\alpha$ to $\alpha'$ with respect to $R$. Then, for all $0 \leq i \leq |\alpha|$, trace(${}_{i}|\alpha$) = trace(${}_{m(i)}|\alpha'$). 


Since for any execution a, 9/a = a and any index mapping maps 0 to 0, the following corollary 
is a direct consequence of Lemma 5.6. 
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Corollary 5.7 


Let A and B be safe I/O automata with in(A) = in(B) and out(A) = out(B) and let R be a
relation over states(A) × states(B). If (α, α′) ∈ R, then trace(α) = trace(α′).


Using this corollary and the Execution Correspondence Theorem (Theorem 5.5), soundness of the
simulation techniques can be proved.


Theorem 5.8 (Soundness of simulations w.r.t. safe implementation) 


Let A and B be safe I/O automata with in(A) = in(B) and out(A) = out(B). Assume for some
X ∈ {F, R, iB} that A ≤_X B. Then A ⊑_S B.


5.1.4 Proving Correct Implementation 


A proof strategy for proving that a live I/O automaton (A, L) correctly implements another live
I/O automaton (B, M) is now described.


Lemma 5.9 


Let (A, L) and (B, M) be live I/O automata with in(A) = in(B) and out(A) = out(B). Also,
let L and M be induced by the temporal formulas Q_L and Q_M, respectively. Assume for some
X ∈ {F, R, iB} that A ≤_X B via S. If, for all α ∈ exec(A) and α′ ∈ exec(B) with (α, α′) ∈ S,
α ⊨ Q_L implies α′ ⊨ Q_M, then (A, L) ⊑_L (B, M).


Proof 


This lemma follows directly from a similar result in [GSSL93] and our definition of a liveness 
condition being induced by a temporal formula. 


Thus, we have the following proof strategy to prove that (A, L) is a correct implementation of
(B, M):


1. Prove a simulation S from A to B with respect to some invariants.


2. Assume α and α′ are arbitrary executions of A and B, respectively, and assume that
(α, α′) ∈ S and α is live (i.e., α ⊨ Q_L).


3. Prove that α′ is also live (i.e., α′ ⊨ Q_M).


This will usually be a proof by contradiction. That is, assume that α′ is not live and show
that this leads to a contradiction. This strategy gives a nice way of splitting the proof
into cases, since being live usually means satisfying a conjunction of conditions, so that
not being live means violating at least one of these conditions. Thus, each of the
conditions can be considered separately.


It is evident that this proof strategy needs a way to go from temporal formulas satisfied by the
high-level execution α′ to temporal formulas satisfied by the low-level execution α. For this
purpose we have identified the following two basic lemmas, which will prove very useful in the
verification examples in Part II of this report.


Lemma 5.10


Let A and B be safe I/O automata with in(A) = in(B) and out(A) = out(B) and let R be
a relation over states(A) × states(B). Furthermore, let α and α′ be executions of A and B,
respectively, such that (α, α′) ∈ R. Finally, let C be a set of external actions (from the common
set of external actions). Then


α ⊨ □◇⟨C⟩ iff α′ ⊨ □◇⟨C⟩


Proof
In Appendix B.


Lemma 5.11 


Let A and B be safe I/O automata with in(A) = in(B) and out(A) = out(B) and let R
be a relation over states(A) × states(B). Furthermore, let α and α′ be executions of A and
B, respectively, such that (α, α′) ∈ R. Assume P and Q are state formulas over A and B,
respectively, such that for all (s, u) ∈ R, if u ⊨ Q, then s ⊨ P. Then,


if α′ ⊨ ◇□Q then α ⊨ ◇□P


Proof
In Appendix B.


5.1.5 History and Prophecy Variables 


In [AL91] history and prophecy variables (together called auxiliary variables) are considered.
It is shown that even when it is not possible to find a refinement mapping from A to B, by
adding appropriate auxiliary variables to A to obtain A_aux it is in most cases possible to find
a refinement mapping from A_aux to B. Then, since A_aux can be shown to be equivalent to (i.e.,
to have the same traces as) A, the soundness of refinement mappings implies that A safely
implements B.

History variables are only allowed to record the past history of the system. Thus, history 
variables are allowed in each step to be assigned a value based on all variables in the system, but 
must not affect the enabledness of actions or the changes made to other (ordinary) variables. 
As we shall see below, it is easy to define syntactically how to add a history variable to a system.

Prophecy variables, on the other hand, are much more complicated since they are allowed 
to constrain the future behavior of the system. It is not possible to give a general syntactic 
characterization of prophecy variables. 


In [GSSL93] and [LV93a] abstract notions of history and prophecy variables are given in terms
of history relations and prophecy relations. A system A_h is then said to be obtained from A
by adding history variables if there exists a history relation from A to A_h, and similarly for
prophecy variables.

The motivation for adding, e.g., history variables to a specification A to obtain A_h is to
ensure that a refinement mapping from A_h to some high-level specification B can be devised.
But since the existence of a history relation from A to A_h implies that there exists a forward
simulation from A to A_h, it is clear that it is possible to define a forward simulation directly
from A to B and thereby avoid mentioning A_h at all. (The forward simulation from A to B
would be the composition of the forward simulation from A to A_h and the refinement mapping
from A_h to B.)

Similarly, instead of adding prophecy variables to A to get A_p such that a refinement mapping
from A_p to B can be devised, it is possible to define a backward simulation directly from A to
B.

Now, since history variables can be defined using simple syntactic constraints, they are almost 
free to use, as opposed to prophecy variables. Thus, the approach we take is to use history 
variables whenever possible (which allows us to use refinement mappings instead of the more 
complicated notion of forward simulations) but to use backward simulations instead of having 
to deal with prophecy variables. Whether to use prophecy variables or backward simulations is 
a matter of taste and probably amounts to the same effort. When using backward simulations 
the complexity lies in showing that the relation is in fact a backward simulation, and when 
using prophecy variables the complexity lies in showing that the variables are in fact prophecy 
variables (which is done in a proof that actually has the flavor of a backward simulation). 


Syntactically Adding History Variables 


Let there be given a syntactic description of a safe I/O automaton A. Then a history variable
h (∉ variables(A)) can be added to A to get A_h as follows:


1. To the list of state variables of A, append a line with h, the type of h, and the initial value
of h.


2. To each step rule of the form


    name
      Precondition:
        P
      Effect:
        E


an assignment to h may be added


    name
      Precondition:
        P
      Effect:
        E
        h := e


where e is an expression that may mention h as well as other variables. Note that
the assignment to h may appear in an if-then-else statement, and that it may be moved
anywhere in the effect clause since this does not affect the assignment of values to any of
the other variables (but of course could affect the value assigned to h).
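The two rules above can be illustrated with a small Python sketch. It is not part of the formal development: states are modeled as dictionaries, a step rule as a (precondition, effect) pair of functions, and the names `add_history_variable`, `project`, and `updates` are hypothetical. The sketch places the assignment h := e after E, so e sees the new values of the ordinary variables and the old value of h.

```python
from typing import Any, Callable, Dict, Tuple

State = Dict[str, Any]
Rule = Tuple[Callable[[State], bool], Callable[[State], State]]  # (precondition P, effect E)


def project(s: State, h: str) -> State:
    """Restrict a state of A_h to the variables of A by dropping h."""
    return {v: x for v, x in s.items() if v != h}


def add_history_variable(
    rules: Dict[str, Rule],
    h: str,
    updates: Dict[str, Callable[[State], Any]],
) -> Dict[str, Rule]:
    """Build the step rules of A_h from those of A.

    updates[name] is the expression e for the rule called name; rules without
    an entry leave h unchanged.
    """
    def extend(name: str, rule: Rule) -> Rule:
        pre, eff = rule

        def pre_h(s: State) -> bool:
            # The precondition never reads h, so enabledness is unchanged.
            return pre(project(s, h))

        def eff_h(s: State) -> State:
            post = dict(eff(project(s, h)))  # E only touches ordinary variables
            post[h] = s[h]                   # h keeps its old value so far
            if name in updates:
                post[h] = updates[name](post)  # h := e, evaluated after E
            return post

        return pre_h, eff_h

    return {name: extend(name, rule) for name, rule in rules.items()}


# A one-variable counter, extended with a history variable that logs values of x.
rules = {"inc": (lambda s: s["x"] < 3, lambda s: {**s, "x": s["x"] + 1})}
rules_h = add_history_variable(rules, "log", {"inc": lambda s: s["log"] + [s["x"]]})

s_h = {"x": 0, "log": []}
pre_h, eff_h = rules_h["inc"]
t_h = eff_h(s_h)
```

Projecting a step of the extended rules onto the original variables gives back a step of the original rules, which is the correspondence one expects from a history variable.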


For step rules for input actions, which have no precondition, the assignment to the history
variable can be added to the effect clause similarly.


We say that A_h is obtained from safe I/O automaton A by adding the history variable h if the
syntactic specification of A_h can be obtained from that of A by 1) and 2). In this case, clearly A_h
is a safe I/O automaton and variables(A_h) = variables(A) ∪ {h}. The following simple lemma
states the close correspondence between the steps of A and A_h.


Lemma 5.12 
Let A_h be obtained from A by adding history variable h. Then,


1. for each (s, a, s′) ∈ steps(A) and each s_h ∈ states(A_h) with s_h ↾ variables(A) = s, there
exists a step (s_h, a, s′_h) ∈ steps(A_h) such that s′_h ↾ variables(A) = s′, and


2. for each (s_h, a, s′_h) ∈ steps(A_h), (s_h ↾ variables(A), a, s′_h ↾ variables(A)) ∈ steps(A).


Lemma 5.13
Let A_h be obtained from A by adding history variable h. Then,


1. for each execution α ∈ exec(A), there exists an execution α_h ∈ exec(A_h) such that α_h ↾ A =
α, and


2. for each execution α_h ∈ exec(A_h), α_h ↾ A ∈ exec(A).


Proof
In Appendix B.


Instead of proving the existence of a history relation from A to A_h we directly prove that A
safely implements A_h and vice versa.


Lemma 5.14
Let A_h be obtained from A by adding history variable h. Then A ⊑_S A_h and A_h ⊑_S A.


Proof
In Appendix B.


We now turn attention to live I/O automata. Let (A, L) be a live I/O automaton and let A_h be
a safe I/O automaton obtained from A by adding history variable h. Define


L_h = {α_h ∈ exec(A_h) | α_h ↾ A ∈ L}


Then (A_h, L_h) is a live I/O automaton since any environment-free strategy (g, f) for (A, L) can
be trivially extended to an environment-free strategy (g_h, f_h) for (A_h, L_h) by letting g_h and f_h
be like g and f except that they make arbitrary (possible) assignments to the history variable.
We say that (A_h, L_h) is a live I/O automaton obtained from (A, L) by adding history variable
h.


Lemma 5.15


Let (A_h, L_h) be obtained from (A, L) by adding history variable h. Then (A, L) ⊑_L (A_h, L_h) and
(A_h, L_h) ⊑_L (A, L).


Proof
In Appendix B.


The final lemma of this section deals with liveness formulas. 


Lemma 5.16 


Let (A_h, L_h) be obtained from (A, L) by adding history variable h, and assume that L is induced
by Q. Then L_h is induced by Q.


Proof
In Appendix B.


We can now turn attention to similar techniques to be used in the timed setting. 


5.2 Timed Systems 


The structure of this section is similar to the structure of Section 5.1. 


5.2.1 Timed Simulation Proof Techniques 


There are only two minor differences between the simulation relations presented here and the 
simulation relations from the untimed case. First, states related by a simulation relation must 
have the same time. Second, since the trace operator on execution fragments does not adequately 
abstract from time-passage actions, the simulation techniques below use a notion of visible trace. 
For any timed automaton A and any execution fragment α of A, define the visible trace of
α, written vis-trace_A(α), or just vis-trace(α) when A is clear from context, to be α ↾ vis(A).
Similarly, given any sequence of actions β, define the visible trace of β, written vis-trace_A(β),
or just vis-trace(β) if A is clear from context, to be β ↾ vis(A).
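As a small illustration, the visible trace of a sequence of actions is just the subsequence of its visible actions. The Python sketch below assumes that actions are hashable values and that time-passage steps are represented as pairs ("nu", t); that encoding and the name `vis_trace` are illustrative assumptions, not the report's notation.

```python
from typing import Hashable, Iterable, List, Set


def vis_trace(actions: Iterable[Hashable], visible: Set[Hashable]) -> List[Hashable]:
    """Restrict a sequence of actions to the visible ones (beta restricted to vis(A))."""
    return [a for a in actions if a in visible]


visible_actions = {"send_msg", "receive_msg"}
# Two fragments that differ only in internal steps and in how time passage is split.
beta_1 = ["send_msg", ("nu", 2.0), "tau", "receive_msg"]
beta_2 = ["send_msg", ("nu", 0.5), ("nu", 1.5), "receive_msg", "tau"]

trace_1 = vis_trace(beta_1, visible_actions)
trace_2 = vis_trace(beta_2, visible_actions)
```

Both fragments have the same visible trace, which is the kind of abstraction from time-passage actions that the timed simulations rely on.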

We now introduce the notions of timed forward simulations, timed refinement mappings, and 
timed backward simulations. 


Definition 5.17 (Timed forward simulation) 


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B) and with
invariants I_A and I_B, respectively. A timed forward simulation from A to B, with respect to I_A
and I_B, is a relation f over states(A) × states(B) that satisfies:


1. If u ∈ f[s] then u.now = s.now.


2. If s ∈ start(A) then f[s] ∩ start(B) ≠ ∅.


3. If (s, a, s′) ∈ steps(A), s, s′ ∈ I_A, and u ∈ f[s] ∩ I_B, then there exists an α ∈ frag*(B) with
fstate(α) = u, lstate(α) ∈ f[s′], and vis-trace(α) = vis-trace(a).


Write A ≤_tF B if there exists a timed forward simulation from A to B with respect to some
invariants I_A and I_B. If f is a timed forward simulation from A to B with respect to some
invariants I_A and I_B, we write A ≤_tF B via f.


Definition 5.18 (Timed refinement mapping) 


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B) and with
invariants I_A and I_B, respectively. A timed refinement mapping from A to B, with respect to
I_A and I_B, is a function r from states(A) to states(B) that satisfies:


1. r(s).now = s.now.


2. If s ∈ start(A) then r(s) ∈ start(B).


3. If (s, a, s′) ∈ steps(A), s, s′ ∈ I_A, and r(s) ∈ I_B, then there exists an α ∈ frag*(B) with
fstate(α) = r(s), lstate(α) = r(s′), and vis-trace(α) = vis-trace(a).


Write A ≤_tR B if there exists a timed refinement mapping from A to B with respect to some
invariants I_A and I_B. If r is a timed refinement mapping from A to B with respect to some
invariants I_A and I_B, we write A ≤_tR B via r.


Definition 5.19 (Timed backward simulation) 


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B) and with
invariants I_A and I_B, respectively. A timed backward simulation from A to B, with respect to
I_A and I_B, is a relation b over states(A) × states(B) that satisfies:


1. If u ∈ b[s] then u.now = s.now.


2. If s ∈ I_A then b[s] ∩ I_B ≠ ∅.


3. If s ∈ start(A) then b[s] ∩ I_B ⊆ start(B).


4. If (s, a, s′) ∈ steps(A), s, s′ ∈ I_A, and u′ ∈ b[s′] ∩ I_B, then there exists an α ∈ frag*(B)
with lstate(α) = u′, fstate(α) ∈ b[s] ∩ I_B, and vis-trace(α) = vis-trace(a).


Write A ≤_tB B if there exists a timed backward simulation from A to B with respect to
some invariants I_A and I_B. If furthermore the timed backward simulation is image-finite, write
A ≤_itB B. If b is a timed backward simulation from A to B with respect to some invariants I_A
and I_B, we write A ≤_tB B (or A ≤_itB B when b is image-finite) via b.


5.2.2 Execution Correspondence


As in the untimed case, the simulation relations imply a certain correspondence between the 
ordinary executions of the involved timed automata. The following definition formalizes this 
correspondence, called timed R-relation, and defines a notion of timed index mapping. The
definition is similar to Definition 5.4 in the untimed model; the only differences are that the R 
relation must relate states with the same time and that the definition below deals with visible 
traces as opposed to traces, i.e., the same differences as in the simulations. 


Definition 5.20 (Timed R-relation and timed index mappings)


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B), and
let R be a relation over states(A) × states(B) such that if (s, u) ∈ R, then s.now = u.now.
Furthermore, let α and α′ be (ordinary) executions of A and B, respectively:


α = s₀a₁s₁a₂s₂⋯


α′ = u₀b₁u₁b₂u₂⋯


Then α and α′ are timed R-related, written (α, α′) ∈_t R, if there exists a total, nondecreasing
mapping m : {0, 1, ..., |α|} → {0, 1, ..., |α′|} such that


1. m(0) = 0,

2. (s_i, u_m(i)) ∈ R for all 0 ≤ i ≤ |α|,

3. vis-trace(b_(m(i−1)+1) ⋯ b_m(i)) = vis-trace(a_i) for all 0 < i ≤ |α|, and

4. for all j, 0 ≤ j ≤ |α′|, there exists an i, 0 ≤ i ≤ |α|, such that m(i) ≥ j.


The mapping m is referred to as a timed index mapping from α to α′ with respect to R.


Write (A, B) ∈_t R if for every execution α of A, there exists an execution α′ of B such that
(α, α′) ∈_t R.


Now the Execution Correspondence Theorem for the timed case [GSSL93] can be stated. 


Theorem 5.21 (Execution Correspondence Theorem) 


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B). Assume
for X ∈ {tF, tR, itB} that A ≤_X B via S. Then (A, B) ∈_t S.


5.2.3 Proving Safe Timed Implementation


Due to the fact that timed R-related executions have the same time in related states and have
a correspondence between their visible traces, it is possible to prove that timed R-related
executions have the same timed traces.


Lemma 5.22


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B) and let R
be a relation over states(A) × states(B) such that if (s, u) ∈ R then s.now = u.now. Then, if
(α, α′) ∈_t R, then t-trace(α) = t-trace(α′).


The soundness of the timed simulations with respect to the timed safe preorders can now be 
stated. 


Theorem 5.23 (Soundness of timed simulations w.r.t. safe timed implementation) 


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B). Assume
for some X ∈ {tF, tR, itB} that A ≤_X B. Then A ⊑_St B.


5.2.4 Proving Correct Timed Implementation 


We can prove the following result which is similar to Lemma 5.9 in the untimed setting. This 
lemma will form the basis of any proof of correct implementation in the timed setting. 


Lemma 5.24 


Let (A, L) and (B, M) be live timed I/O automata with in(A) = in(B) and out(A) = out(B).
Also, let L and M be induced by Q_L and Q_M, respectively, and assume that Q_M is minimal.
Assume for some X ∈ {tF, tR, itB} that A ≤_X B via S. If, for all α ∈ exec^∞(A) and α′ ∈
exec^∞(B) with (α, α′) ∈_t S, α ⊨ Q_L implies α′ ⊨ Q_M, then (A, L) ⊑_Lt (B, M).


Proof 


This lemma directly follows from a similar result in [GSSL93] and our definition of a sampling 
characterization being induced by a temporal formula. 


Lemma 5.24 can be used to prove the correct timed implementation relation between two live 
timed I/O automata in a manner similar to the way Lemma 5.9 is used in the untimed model. 
However, one must first prove that the high-level liveness condition is induced by a minimal 
timed liveness formula. 

The following lemmas correspond to Lemmas 5.10 and 5.11 above. 


Lemma 5.25 


Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B) and let R be
a relation over states(A) × states(B) such that if (s, u) ∈ R, then s.now = u.now. Furthermore,
let α and α′ be executions of A and B, respectively, such that (α, α′) ∈_t R. Finally, let C be a
set of visible actions (from the common set of visible actions). Then


α ⊨ □◇⟨C⟩ iff α′ ⊨ □◇⟨C⟩


Proof
Similar to the proof of Lemma 5.10.


Lemma 5.26 
Let A and B be safe timed I/O automata with in(A) = in(B) and out(A) = out(B) and let R be
a relation over states(A) × states(B) such that if (s, u) ∈ R, then s.now = u.now. Furthermore,
let α and α′ be executions of A and B, respectively, such that (α, α′) ∈_t R. Assume P and Q are
state formulas over A and B, respectively, such that for all (s, u) ∈ R, if u ⊨ Q, then s ⊨ P.
Then,


if α′ ⊨ ◇□Q then α ⊨ ◇□P


Proof 
Similar to the proof of Lemma 5.11. 


5.2.5 History and Prophecy Variables 


As in the untimed setting, it is possible to add history variables to safe and live timed I/O
automata. As above, we only deal with history variables and adhere to timed backward simulations
instead of using prophecy variables.


Syntactically Adding History Variables 


The syntactic rules for adding history variables to a safe timed I/O automaton are the same as
the rules in the untimed setting. However, in the timed setting, we do not allow history
variables to be updated in time-passage steps since otherwise the resulting object would not
necessarily be a safe timed I/O automaton (that is, the trajectory axiom S5 of Definition 2.17
could be violated). Thus, a history variable h (∉ variables(A)) can be added to a safe timed
I/O automaton A to get A_h by following the two rules in Section 5.1.5 with the restriction
that h must not be changed in the step rule for the time-passage action ν. We say that A_h is
obtained from A by adding the history variable h. Clearly A_h is a safe timed I/O automaton
and variables(A_h) = variables(A) ∪ {h}.


In previous chapters we have defined how to restrict ordinary executions to subsets of state
variables and actions. Below we need a similar notion for timed executions; however, we need
only deal with restriction to a subset of the state variables. So, let Σ = ω₀a₁ω₁a₂ω₂⋯ be a timed
execution of a safe timed I/O automaton A. Then, for any set V ⊆ variables(A), define Σ ↾ V to
be the sequence ω′₀a₁ω′₁a₂ω′₂⋯, where for each index i and each t ∈ dom(ω_i), ω′_i(t) = ω_i(t) ↾ V.
Thus, informally, Σ ↾ V is obtained from Σ by restricting all states in the range of all trajectories
to V. If A_h is obtained from A by adding history variable h and Σ_h ∈ t-exec(A_h), we let Σ_h ↾ A
be a shorthand for Σ_h ↾ variables(A).


As in the untimed case, we have the following lemmas. 


Lemma 5.27 
Let A_h be obtained from A by adding history variable h. Then,


1. for each (s, a, s′) ∈ steps(A) and each s_h ∈ states(A_h) with s_h ↾ variables(A) = s, there
exists a step (s_h, a, s′_h) ∈ steps(A_h) such that s′_h ↾ variables(A) = s′, and


2. for each (s_h, a, s′_h) ∈ steps(A_h), (s_h ↾ variables(A), a, s′_h ↾ variables(A)) ∈ steps(A).


Lemma 5.28
Let A_h be obtained from A by adding history variable h. Then,


1. for each timed execution Σ ∈ t-exec(A), there exists a timed execution Σ_h ∈ t-exec(A_h)
such that Σ_h ↾ A = Σ, and


2. for each timed execution Σ_h ∈ t-exec(A_h), Σ_h ↾ A ∈ t-exec(A).


Proof
In Appendix B.


These lemmas allow us to prove that a safe timed I/O automaton A is a safe implementation of
any safe timed I/O automaton A_h obtained by adding history variable h to A, and vice versa.


Lemma 5.29
Let A_h be obtained from A by adding history variable h. Then A ⊑_St A_h and A_h ⊑_St A.


Proof
Similar to the proof of Lemma 5.14 by using Lemma 5.28.


Now, let (A, L) be a live timed I/O automaton and let A_h be a safe timed I/O automaton
obtained from A by adding history variable h. Define


L_h = {Σ_h ∈ t-exec^∞(A_h) | Σ_h ↾ A ∈ L}


Then (A_h, L_h) is a live timed I/O automaton since any environment-free strategy (g, f) for
(A, L ∪ t-exec^Zt(A)) can be trivially extended to an environment-free strategy (g_h, f_h) for
(A_h, L_h ∪ t-exec^Zt(A_h)) by letting g_h and f_h be like g and f except that they make arbitrary
(possible) assignments to the history variable. We say that (A_h, L_h) is a live timed I/O automaton
obtained from (A, L) by adding history variable h.


Lemma 5.30 


Let (A_h, L_h) be obtained from (A, L) by adding history variable h. Then (A, L) ⊑_Lt (A_h, L_h) and
(A_h, L_h) ⊑_Lt (A, L).


Proof
Similar to the proof of Lemma 5.15 by using Lemma 5.28.


Before we can prove the final lemma, which deals with timed liveness formulas, we state the
following trivial result without proof.


Lemma 5.31 


Let A_h be obtained from A by adding history variable h. Furthermore, let α_h and Σ_h range
over exec(A_h) and t-exec(A_h), respectively, and let α and Σ range over exec(A) and t-exec(A),
respectively. Then,


1. if α_h samples Σ_h, then α_h ↾ A samples Σ_h ↾ A, and

2. if α samples Σ_h ↾ A, then there exists an α_h such that α = α_h ↾ A and α_h samples Σ_h.


Lemma 5.32 


Let (A_h, L_h) be obtained from (A, L) by adding history variable h, and assume that L is induced
by Q. Then L_h is induced by Q.


Proof
In Appendix B.


This concludes the theoretical part of the report. We now turn attention to the verification 
example of proving correctness of two solutions to the at-most-once message delivery problem. 


Part II 


Reliable At-Most-Once Message Delivery Protocols


A Protocol Verification Example


Chapter 6


Specification S


This chapter describes the top-level specification of the "at-most-once message delivery"
problem. The specification will be given in terms of a live I/O automaton. The objective of the S
level is to give a clear, easy-to-understand specification that can easily be checked to have the
desirable behavior.

The at-most-once message delivery problem is that of delivering a sequence of messages 
submitted by a user at one location to another user at another location. Ideally, we would like 
to insist that all messages be delivered in the order in which they are sent, each exactly once, 
and that an acknowledgement be returned for each delivered message. 

Unfortunately, it is expensive to achieve these goals in the presence of failures (e.g., node 
crashes). In fact, it is impossible to achieve them at all unless some change is made to the 
stable state (i.e., the state that survives a crash) for each message. To permit less expensive 
solutions, we weaken the statement of the problem slightly. We allow some messages to be lost 
when a node crash occurs; however, no messages should otherwise be lost, and those messages 
that are delivered should not be reordered or duplicated. (The specification is weakened in this 
way because message loss is generally considered to be less damaging than duplicate delivery.) 
Now it is required that the user who sent the message receive either an acknowledgement that 
the message has been delivered, or in the case of crashes, an indication that the message might 
have been lost. 

Even though our specification S is centralized (i.e., has no distributed structure), the external 
actions of S can be partitioned into actions connected to the user at the sender side and actions 
connected to the user at the receiver side. This user interface, which will be the same for all 
subsequent implementations, is depicted in Figure 6.1, where the specification S is shown as a 
“black box”. 

A user can send a message m to the system by issuing a send_msg(m) action, and the system
can pass a message m to the user at the receiver end by means of a receive_msg(m) action.
Crashes at the sender and receiver sides are modeled as inputs crash_s and crash_r, respectively²,
and the corresponding recovery actions are outputs recover_s and recover_r. If a crash_s action
has occurred but not yet a recover_s action, we say that the sender side is crashed or, equivalently,
that it is in the recovery phase. Correspondingly for the receiver side. During a crash messages can
be lost. In S this is modeled by lose(I) actions (not depicted in Figure 6.1 since they are internal).


¹ Our definition of at-most-once message delivery is different from what some people call at-most-once message
delivery in that we include acknowledgements and require messages to be delivered in order.

² We will use subscripts s and r on actions and state variables to indicate which are related to the sender and
receiver sides, respectively.


Figure 6.1: The specification S as a "black box". (Figure not reproduced; it shows the sender side and the
receiver side with the actions send_msg(m), receive_msg(m), and the crash and recover actions.)


Finally, there is a simple acknowledgement mechanism incorporated into the specification. 
An action ack(b), where b is a Boolean, notifies the user at the sender side about the status of
the last message sent. If acknowledgements are needed for each message, the user must wait for 
acknowledgement before sending the next message. Our simpler acknowledgement mechanism 
reflects the way typical low-level protocols work. Thus, if the user sends a sequence of messages
m_1, ..., m_n without waiting for acknowledgement between each pair of messages, a subsequent
acknowledgement will be for message m_n. Ideally, an ack(true) should be issued if the last
message sent has been successfully delivered to the receiver, and an ack(false) should be issued 
if the last message has been lost during a crash. This is, again, impossible to obtain in a 
distributed implementation unless some changes are made to the stable state for each message, 
so we will use a weaker acknowledgement mechanism: if an ack(true) is issued, the last message
has been successfully received. If, on the other hand, an ack(false) is issued, the only thing the
user can infer is that a crash has occurred. Thus, even in the case of negative acknowledgement, 
the last message might have been successfully delivered since all messages are not necessarily 
lost during crashes. 


6.1 The Specification of S


We now define the live I/O automaton representing the specification S. We will let S represent
both the name of this level of development and the name of the live I/O automaton.

We specify S by defining its components (cf. Definitions 2.1 and 2.8). We refer to the safe
I/O automaton part of S by A_S, and to the liveness part by L_S. Thus, S = (A_S, L_S). L_S will be
specified implicitly by an environment-free liveness formula Q_S for A_S.


6.1.1 States and Start States 


In S and the lower level protocols we assume that messages are taken from a set Msg. We require
that nil ∉ Msg but assume no other properties of Msg.

The state space of S is made up of four state variables as shown in the following table, which
furthermore shows the types and initial values of the state variables. The status variable ranges
over the set


Stat = Bool ∪ {?}


Variable  Type  Initially  Description
queue     Msg*  ε          The list of messages sent but not yet delivered.
rec_s     Bool  false      true iff the sender side has crashed and not yet recovered.
rec_r     Bool  false      true iff the receiver side has crashed and not yet recovered.
status    Stat  false      Indicates the status of the last message sent. The special value '?'
                           indicates that the last message sent is still in queue and no crashes
                           have occurred since it was sent.


6.1.2 Actions 


The set of actions of S consists of the input and output actions from Figure 6.1 plus the internal
lose(I) action.


Input:
    send_msg(m), m ∈ Msg
    crash_s
    crash_r
Output:
    receive_msg(m), m ∈ Msg
    ack(b), b ∈ Bool
    recover_s
    recover_r
Internal:
    lose(I), I ⊆ ℕ


6.1.3 Steps 


The transition relation steps(A_S) will be specified using the precondition-effect style presented
in Section 4.1.1.


send_msg(m)
  Effect:
    queue := queue ^ m
    status := ?

receive_msg(m)
  Precondition:
    queue ≠ ε ∧
    head(queue) = m
  Effect:
    queue := tail(queue)
    if queue = ε ∧ status = ? then
      status := true

ack(b)
  Precondition:
    status = b
  Effect:
    none

crash_s
  Effect:
    rec_s := true

crash_r
  Effect:
    rec_r := true

lose(I)
  Precondition:
    (rec_s = true ∨ rec_r = true) ∧ I ⊆ dom(queue)
  Effect:
    if queue ≠ ε ∧ maxidx(queue) ∈ I then
      status := false
    else
      optionally status := false
    queue := delete(queue, I)

recover_s
  Precondition:
    rec_s = true
  Effect:
    rec_s := false

recover_r
  Precondition:
    rec_r = true
  Effect:
    rec_r := false


The function delete in the step rule for lose(I) deletes messages with indices in I from queue.
Formally, for any list q and any set I ⊆ dom(q), define


delete(q, I) = ⟨q[i] | i ∈ dom(q) ∧ i ∉ I⟩
The notation to the right of = is defined in Appendix A. 
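As an informal illustration, delete can be written as a list comprehension. The Python sketch below assumes that dom(q) is the index set {1, ..., |q|}, i.e., that lists are indexed from 1; Appendix A is not reproduced here, so that convention is an assumption of the sketch.

```python
from typing import List, Set, TypeVar

T = TypeVar("T")


def delete(q: List[T], indices: Set[int]) -> List[T]:
    """Return q without the elements whose (1-based) indices lie in `indices`."""
    dom = set(range(1, len(q) + 1))  # assumed: dom(q) = {1, ..., |q|}
    if not indices <= dom:
        raise ValueError("indices must be a subset of dom(q)")
    # Keep the remaining elements in their original order.
    return [q[i - 1] for i in sorted(dom) if i not in indices]


remaining = delete(["m1", "m2", "m3"], {2})
```

Deleting the empty set leaves the list unchanged, and deleting all of dom(q) yields the empty list.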


The handling of queue, rec_s, and rec_r in the step rules is self-explanatory. The handling of
status is a bit more complicated: when a new message m is sent to the system (modeled by
send_msg(m) steps), status is changed to ? to indicate that the last message sent is in queue.
When a message is delivered to the receiver (modeled by receive_msg(m) steps) and queue
thereby becomes empty, status should be changed to true, but only if the message delivered
is in fact the last message sent and not another message, which happens to be last on queue
because the last message sent has been lost in a crash. Thus, at any point a status value of ?
indicates that the message at the end of queue is actually the last message sent by the sender.
This explains the receive_msg(m) steps. The lose(I) action then records if the message at the
end of queue is lost by changing status to false. (If the message at the end of queue is not the
last message sent, status would already be false.) On the other hand, if the message at the end
of queue is not deleted, we are still allowed to change status to false according to our informal
description of the acknowledgement mechanism given in the introduction to this chapter.

Note, that it is possible for the system to output a positive acknowledgement for a message 
and then “change its mind” and start issuing negative acknowledgements. However, this change 
of mind can only happen during a crash. (In such a situation the user knows that the last 
message has been delivered since she has received a positive acknowledgement. ) 
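The behavior of status described above can be explored with a small executable model of the step rules. The following Python class is only a sketch under stated assumptions: messages are strings, queue is initially empty and indexed from 1, the nondeterministic "optionally status := false" is exposed as a hypothetical parameter `force_false`, and liveness is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import List, Set

UNKNOWN = "?"  # the special status value '?'


@dataclass
class SpecS:
    queue: List[str] = field(default_factory=list)  # assumed initially empty
    rec_s: bool = False
    rec_r: bool = False
    status: object = False  # True, False, or UNKNOWN

    def send_msg(self, m: str) -> None:
        self.queue.append(m)
        self.status = UNKNOWN

    def receive_msg(self, m: str) -> None:
        if not self.queue or self.queue[0] != m:
            raise ValueError("receive_msg(m) is not enabled")
        self.queue.pop(0)
        if not self.queue and self.status == UNKNOWN:
            self.status = True

    def ack_enabled(self, b: bool) -> bool:
        return self.status is b  # ack(b) has no effect, only a precondition

    def crash_s(self) -> None:
        self.rec_s = True

    def crash_r(self) -> None:
        self.rec_r = True

    def recover_s(self) -> None:
        if not self.rec_s:
            raise ValueError("recover_s is not enabled")
        self.rec_s = False

    def recover_r(self) -> None:
        if not self.rec_r:
            raise ValueError("recover_r is not enabled")
        self.rec_r = False

    def lose(self, indices: Set[int], force_false: bool = False) -> None:
        n = len(self.queue)
        if not (self.rec_s or self.rec_r) or not indices <= set(range(1, n + 1)):
            raise ValueError("lose(I) is not enabled")
        if n > 0 and n in indices:
            self.status = False  # the message at the end of queue is lost
        elif force_false:
            self.status = False  # "optionally status := false"
        self.queue = [x for i, x in enumerate(self.queue, start=1) if i not in indices]


# The last message sent is lost during a receiver crash; an older one survives.
s = SpecS()
s.send_msg("m1")
s.send_msg("m2")
s.crash_r()
s.lose({2})
s.recover_r()
s.receive_msg("m1")
```

After this run the queue is empty but status is false rather than true, so only negative acknowledgements are enabled, even though m1 was delivered.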

Another thing to note is the fact that the ack(b) steps do not disable themselves. Thus, once 
status becomes true or false, acknowledgements can be sent continuously until a new message 
is put into queue by a send_msg(m) step. (Actually, with the liveness restrictions we present 
below, acknowledgements must be issued infinitely often if status stays true or false, and no 
crashes occur.) A remedy to this situation would be to introduce an additional flag, which is 
set when status is changed from ? to a Boolean, and reset when an acknowledgement is issued. 
Acknowledgements should then only be enabled when this flag is set. We have chosen not to
introduce the flag since it would add few interesting aspects to the implementations.
6.1.4 Liveness


We now present the environment-free liveness formula Q_S for A_S, which induces the liveness
condition L_S. The liveness we specify for S is weak fairness to four sets of locally-controlled
actions. Two of these sets have associated forcing conditions. Note that lose(I) actions are not
in any set since we do not want to force the system to lose anything. Informally, the sets and
forcing conditions are:


1. ack(b) actions
   Forcing condition: rec_s = rec_r = false
2. receive_msg(m) actions
   Forcing condition: rec_s = rec_r = false
3. recover_s
4. recover_r


With these liveness restrictions we guarantee that in the absence of crashes, messages in queue 
will be delivered and acknowledgements for the last message will be issued unless new messages 
are sent to the system. Furthermore, both the sender side and the receiver side are guaranteed 
to recover after a crash. (This requirement on recovery could be removed from all levels of 
abstraction without affecting other liveness properties. All interesting liveness properties are, in 
fact, conditioned by the assumption that no new crashes occur.) 


The liveness requirements can be formalized in the following way. Let 


C_{S,1} = {ack(true), ack(false)}
C_{S,2} = {receive_msg(m) | m ∈ Msg}
C_{S,3} = {recover_s}
C_{S,4} = {recover_r}


Then the formalization of Q_S is


Q_S = WF(C_{S,1}, rec_s = false ∧ rec_r = false) ∧
      WF(C_{S,2}, rec_s = false ∧ rec_r = false) ∧
      WF(C_{S,3}) ∧
      WF(C_{S,4})


By Lemma 4.7, Q_S is an environment-free liveness formula for A_S. Thus, S = (A_S, L_S) is a live
I/O automaton. Furthermore, by Lemma 4.8, Q_S is stuttering-insensitive.


This concludes the formal specification of the at-most-once message delivery problem. 


Chapter 7 


Delayed-Decision Specification D 


In our specification S, presented in Chapter 6, we saw that it is allowed to lose any number of
messages in the system, but only if either rec_s or rec_r is true, i.e., we can only lose messages
between crash and recovery. In the low-level protocols we consider, the choice whether or not
to lose a message because of a crash may be postponed until after recovery, and the choice
depends on certain race conditions on the network channels: a message m traveling on a
channel and the receiver have no way of knowing if the sender has crashed, so even if the sender
has crashed, the message might still be successfully received by the receiver. But, if the sender
recovers and sends a new message on the channel, the reception of this new message before m
(our channels are not FIFO) will lead to m being discarded when it is eventually received (since
otherwise messages could be reordered).

This postponing of nondeterministic choices suggests that we at one point have to rely on a 
backward simulation to prove correctness of the low-level protocols. In a first attempt, a timed 
backward simulation was proved directly from the Clock-Based Protocol C to S (or rather the 
patient version of S). A lot of this work would have had to be repeated in a backward simulation 
from the Five-Packet Handshake Protocol H to S, so after having designed the Generic Protocol
G, we proved a backward simulation from G to S, and could then do with a timed refinement
from C to patient(G) and a refinement from H to G.

Still, the proof from G to S was very large and comprehensive. It is our experience that
backward simulations are generally difficult to deal with, mainly because they are not so intuitive 
as forward simulations. This observation led us to try to “limit” the backward simulation to 
a development step as small as possible. Generally, one should always try to find steps of 
development that are intuitive, and remember that a series of steps (with proofs) are generally 
easier to comprehend than is one big proof, even though the combined length of the small proofs 
might exceed the length of the big proof. 

So, as an intermediate level between S and G we came up with the Delayed-Decision
Specification D, which looks very much like S, but instead of deleting messages between crash and
recovery, D marks arbitrary messages, and marked messages can then be lost at any point. D
also deals with postponing the loss of status (i.e., changing it to false) as the result of a crash.
When we describe the steps of D, we will further explain the differences between S and D.

It should be noted that even though we postpone the decision about which messages to lose,
only messages which were in the system between crash and recovery can be lost. A system that
did not satisfy this restriction could not, of course, implement S.


The rest of this chapter is organized as follows. First, in Section 7.1, we present D and then, in
Section 7.2, we prove that D correctly implements S.


7.1 The Specification of D 


We specify D = (A_D, L_D) as a live I/O automaton using the notation introduced in Chapter 4.
L_D will be specified implicitly by the environment-free liveness formula Q_D for A_D.


7.1.1 States and Start States 


The marks we put on messages and status are taken from the following set: 


Flag = {OK, marked}


Variable   Type            Initially     Description

queue      (Msg × Flag)*   ε             The list of messages in the system. Each
                                         message has an associated flag. If the flag
                                         value is marked, the message might be lost
                                         in a subsequent drop(I) action.

rec_s      Bool            false         true iff the sender has crashed and not yet
                                         recovered.

rec_r      Bool            false         true iff the receiver has crashed and not yet
                                         recovered.

status     Stat × Flag     (false, OK)   Indicates the status of the last message sent.
                                         If the associated flag is marked, the status
                                         might be changed to false in a subsequent
                                         drop(I) action.


We use the normal record notation to extract components of a value or variable. For instance, 
status.stat and status.flag extract the status value and status flag from status. 

We say that status is marked if status.flag = marked, and correspondingly an element e of
queue is marked if e.flag = marked. If an element of queue or the status is not marked, it is said
to be OK or “not marked”.


7.1.2 Actions 


The input and output actions, i.e., the user interface, of A_D are, of course, the same as for A_S.
A_D has the internal actions mark(I), unmark(I), and drop(I).


Input:
    send_msg(m), m ∈ Msg
    crash_s
    crash_r
Output:
    receive_msg(m), m ∈ Msg
    ack(b), b ∈ Bool
    recover_s
    recover_r
Internal:
    mark(I), I ⊆ ℕ
    unmark(I), I ⊆ ℕ
    drop(I), I ⊆ ℕ


7.1.3 Steps 


Here we present the steps of Ap. An explanation of the steps is offered below. 


send_msg(m)
    Effect:
        queue := queue ^ (m, OK)
        status := (?, OK)

receive_msg(m)
    Precondition:
        queue ≠ ε ∧
        (head(queue)).msg = m
    Effect:
        queue := tail(queue)
        if queue = ε ∧ status.stat = ? then
            status.stat := true

ack(b)
    Precondition:
        status.stat = b
    Effect:
        status.flag := OK

crash_s
    Effect:
        rec_s := true

crash_r
    Effect:
        rec_r := true

recover_s
    Precondition:
        rec_s = true
    Effect:
        rec_s := false

recover_r
    Precondition:
        rec_r = true
    Effect:
        rec_r := false

mark(I)
    Precondition:
        (rec_s = true ∨ rec_r = true) ∧ I ⊆ dom(queue)
    Effect:
        queue := mark(queue, I)
        optionally status.flag := marked

unmark(I)
    Precondition:
        I ⊆ dom(queue)
    Effect:
        queue := unmark(queue, I)
        optionally status.flag := OK

drop(I)
    Precondition:
        I ⊆ {i | i ∈ dom(queue) ∧ queue[i].flag = marked}
    Effect:
        if queue ≠ ε ∧ maxidx(queue) ∈ I then
            status := (false, OK)
        else if status.flag = marked then
            optionally status := (false, OK)
        queue := delete(queue, I)


In the step rule for drop we use the function delete, which was defined in Chapter 6 and used in
the definition of lose(I) at the S level. The precondition of drop(I) guarantees that only marked
messages are deleted. The step rule for mark uses a function mark, which is intended to mark
messages with indices in I. Formally, for any queue q ∈ (Msg × Flag)* and any set I ⊆ dom(q),
define


mark(q, I) = ((if i ∈ I then (q[i].msg, marked) else q[i]) | i ∈ dom(q))

Similarly, the step rule for unmark uses the function unmark defined as

unmark(q, I) = ((if i ∈ I then (q[i].msg, OK) else q[i]) | i ∈ dom(q))
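
The two functions can be illustrated by a small Python sketch. It is only an illustration under our own assumptions: queue elements are modelled as (msg, flag) pairs, flags as the strings "OK" and "marked", and indices start at 0.

```python
from typing import List, Set, Tuple

OK = "OK"
MARKED = "marked"
Entry = Tuple[str, str]  # (msg, flag); this encoding is an assumption of the sketch


def mark(q: List[Entry], I: Set[int]) -> List[Entry]:
    """Set the flag of every element with index in I to marked."""
    assert I <= set(range(len(q))), "I must be a subset of dom(q)"
    return [(msg, MARKED) if i in I else (msg, flag)
            for i, (msg, flag) in enumerate(q)]


def unmark(q: List[Entry], I: Set[int]) -> List[Entry]:
    """Set the flag of every element with index in I to OK."""
    assert I <= set(range(len(q))), "I must be a subset of dom(q)"
    return [(msg, OK) if i in I else (msg, flag)
            for i, (msg, flag) in enumerate(q)]
```

Both functions leave the messages and the length of the queue unchanged; only flags at the selected indices are overwritten, so marking an already marked element has no effect.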


Furthermore, note that when a new message is put into queue (by send_msg(m)), the message 
and status get the flag OK to indicate that they cannot be lost (yet). In the definition of the 
receive_msg(m) steps it is seen that a message might be successfully delivered to the receiver 
even though it is marked. This is because a marked message only has the possibility of being
deleted.


Recall from the definition of S that there are two ways in which status can be lost (i.e., get a
status value of false), and both ways are described in the definition of lose(I) in A_S: 1) if the
element at the end of the queue is deleted, then the status is required to be lost, and 2) in any
lose(I) step the status may be lost.

In A_D a status flag of marked corresponds to point 2), i.e., that status may be lost. In
the mark(I) steps of A_D permission is given to lose some messages and maybe status. Then
in drop(I) steps of A_D, which do the actual deleting performed by lose(I) in A_S, status is
required to be lost if the element at the end of queue is deleted, even though status is OK. This
corresponds to point 1) above, where status is required to be lost. Steps labeled by drop(I) are,
of course, always allowed to lose a marked status.
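
The interplay between deleting messages and losing status in a drop(I) step can be made concrete with a small Python sketch. It is a model of the step rule under our own assumptions: queue elements are (msg, flag) pairs, status is a (stat, flag) pair with stat one of the strings "?", "true", "false", indices start at 0, and the nondeterministic "optionally" is represented by returning every possible successor. The name drop_outcomes is hypothetical.

```python
from typing import List, Set, Tuple

OK, MARKED = "OK", "marked"
Entry = Tuple[str, str]   # (msg, flag)
Status = Tuple[str, str]  # (stat, flag), stat in {"?", "true", "false"}


def drop_outcomes(queue: List[Entry], status: Status,
                  I: Set[int]) -> List[Tuple[List[Entry], Status]]:
    """Return all (queue, status) pairs that a drop(I) step may produce."""
    marked = {i for i, (_, flag) in enumerate(queue) if flag == MARKED}
    # Precondition: only marked elements may be deleted.
    if not I <= marked:
        raise ValueError("drop(I) may only delete marked elements")
    lost = ("false", OK)
    if queue and (len(queue) - 1) in I:
        # Point 1): the last element is deleted, so status must be lost.
        statuses = [lost]
    elif status[1] == MARKED:
        # Point 2): a marked status may be lost, or may be left unchanged.
        statuses = [status, lost]
    else:
        statuses = [status]
    new_queue = [e for i, e in enumerate(queue) if i not in I]
    return [(new_queue, st) for st in statuses]
```

With queue = [("a", OK), ("b", marked)] and status = (?, OK), dropping index 1 has a single outcome in which status becomes (false, OK), even though status itself was not marked.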


The effect clause in the definition of the ack(b) steps is explained as follows: suppose status.stat =
? and that status.flag has been changed to marked during a crash (by mark(I)). In a subsequent
receive_msg(m) step that empties queue, status.stat is changed to true, which enables
an ack(true) action. After the receive_msg(m) step, status = (true, marked), so there is still
a possibility of losing status. However, once a positive acknowledgement has been issued, the
system must not lose status and start issuing negative acknowledgements. Remember from the
S level that the system is only allowed to change its mind in this respect during a crash. Thus,
by changing status.flag to OK in the ack steps, we disallow this change of mind. Note that it
would be too restrictive to change status to (true, OK) in receive_msg(m) since we want A_D to
be as nondeterministic as possible, to allow as many implementations as possible.

Another point where we have made A_D very nondeterministic is in the way messages (and
status) are marked and deleted. In a mark(I) step some messages are marked and in an
unmark(I) step, which can happen at any time, some of the marked messages can be made
OK again, and finally in a drop(I) step, some of the marked messages are deleted.

Here, again, the point is that we want A_D to be as nondeterministic as possible. Of course
the effect of marking some elements could be obtained by a “deterministic” mark that marks
everything followed by unmark(I). However, when performing simulation proofs from lower
levels of abstraction, it is desirable, for clarity, to have as nondeterministic actions of A_D as
possible. Thus, by removing nondeterminism from A_D, which could not jeopardize its correctness
with respect to A_S, we might rule out some implementations and make the correctness proofs
of other implementations more cumbersome.
7.1.4 Liveness


As at the S level, we specify liveness in terms of fairness. Specifically, the liveness condition L_D
at the D level will be specified implicitly as an environment-free liveness formula Q_D for A_D.
Q_D will be stated as a conjunction of four weak fairness formulas, two of which have associated
forcing conditions. We do not require fairness on the actions mark(I), unmark(I), and drop(I).
Informally, we have the four weak fairness conjuncts:


1. ack(b) actions
   Forcing condition: rec_s = rec_r = false
2. receive_msg(m) actions
   Forcing condition: rec_s = rec_r = false
3. recover_s
4. recover_r


This ensures the same liveness as at the S level. Formally, let


C_{D,1} = {ack(true), ack(false)}
C_{D,2} = {receive_msg(m) | m ∈ Msg}
C_{D,3} = {recover_s}
C_{D,4} = {recover_r}


Then the formalization of Q_D is


Q_D = WF(C_{D,1}, rec_s = false ∧ rec_r = false) ∧
      WF(C_{D,2}, rec_s = false ∧ rec_r = false) ∧
      WF(C_{D,3}) ∧
      WF(C_{D,4})


By Lemma 4.7, Q_D is an environment-free liveness formula for A_D. Thus, D = (A_D, L_D) is a live
I/O automaton. Furthermore, by Lemma 4.8, Q_D is stuttering-insensitive.


This concludes the Delayed-Decision Specification of the at-most-once message delivery problem
and attention is now turned towards proving that D correctly implements S.


7.2 Correctness of D


In this section we prove that D = (A_D, L_D) is a correct implementation of our specification
S = (A_S, L_S). First we give some invariants of A_D. Then we prove, by means of an image-finite
backward simulation, that A_D safely implements A_S, and finally we use this simulation result to
prove that D correctly implements S.


7.2.1 Invariants


We only need one invariant in the proof. The invariant should be understood as the conjunction 
of the two parts. 
Invariant 7.1


1. if status.stat = ? then queue ≠ ε


2. if status.stat = true then queue = ε


Proof 


By a simple inductive argument, it is easily proven that all reachable states of A_D satisfy the
two parts of the invariant, so we omit the proof here. At the lower levels of abstraction we will
give examples of proofs of more interesting invariants.


Below, we refer to this invariant by I_D.


7.2.2 Safety 


To show that A_D safely implements A_S, we show the existence of an image-finite backward
simulation from A_D to A_S with respect to some invariants. However, before we can do this we
need a few preliminary definitions and lemmas.


Below we let q_D be a queue at the D level, i.e., q_D ∈ (Msg × Flag)*, and let q_S be a queue at the
S level, i.e., q_S ∈ Msg*.


Definition 7.2 (Explanation)


Define an explanation from q_S to q_D to be any mapping f : dom(q_S) → dom(q_D) that satisfies
the following four conditions:


1. f is total
2. f is strictly increasing
3. ∀i ∈ dom(q_D) \ rng(f) : q_D[i].flag = marked
4. ∀i ∈ dom(q_S) : q_D[f(i)].msg = q_S[i]

Basically, if there exists an explanation from q_S to q_D, this means that q_S can be obtained from
q_D by first deleting some of the marked elements of q_D and then removing the flags from the
remaining elements.
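
To make the four conditions concrete, the following Python sketch checks whether a given mapping is an explanation. It is an illustration only, under our own assumptions: f is a dictionary from indices to indices, D-level queue elements are (msg, flag) pairs, and indices start at 0. The name is_explanation is hypothetical.

```python
from typing import Dict, List, Tuple

OK, MARKED = "OK", "marked"
Entry = Tuple[str, str]  # (msg, flag)


def is_explanation(f: Dict[int, int], q_s: List[str], q_d: List[Entry]) -> bool:
    """Check Conditions 1-4 of Definition 7.2 for the mapping f."""
    dom_s, dom_d = range(len(q_s)), range(len(q_d))
    # Condition 1: f is total on dom(q_S) and maps into dom(q_D).
    if set(f) != set(dom_s) or any(f[i] not in dom_d for i in dom_s):
        return False
    # Condition 2: f is strictly increasing.
    if any(f[i] >= f[i + 1] for i in range(len(q_s) - 1)):
        return False
    # Condition 3: every element of q_D outside the range of f is marked.
    rng = set(f.values())
    if any(q_d[j][1] != MARKED for j in dom_d if j not in rng):
        return False
    # Condition 4: related elements carry the same message.
    return all(q_d[f[i]][0] == q_s[i] for i in dom_s)
```

For q_D = [(a, OK), (b, marked), (c, OK)], the mapping {0 ↦ 0, 1 ↦ 2} explains q_S = [a, c], whereas no mapping explains q_S = [a, b], because the OK element c would be left outside the range.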


Lemma 7.3


Let f be an explanation from q_S to q_D. Then |q_S| ≤ |q_D|.


Proof


Suppose |q_S| > |q_D|. Then it is impossible to find a mapping from dom(q_S) to dom(q_D) that is
total and strictly increasing, thus Conditions 1 and 2 of Definition 7.2 are violated. Hence, we
can conclude |q_S| ≤ |q_D|.

Now, define #OK(q_D) to be the number of elements e of q_D with e.flag = OK. Thus, formally

#OK(q_D) = |q_D ↾ (Msg × {OK})|


Lemma 7.4


Let f be an explanation from q_S to q_D. Then |q_S| ≥ #OK(q_D).


Proof


Suppose |q_S| < #OK(q_D). Then Conditions 1 and 2 of Definition 7.2 give us that |rng(f)| =
|q_S| (< #OK(q_D)), so there must exist indices i in q_D such that q_D[i].flag = OK and i ∉ rng(f).
But this contradicts Condition 3 of Definition 7.2. Hence, we can conclude |q_S| ≥ #OK(q_D).
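
The bounds of Lemmas 7.3 and 7.4 can be observed on small examples by enumerating every S-level queue that has an explanation to a given D-level queue. The Python sketch below is a brute-force illustration, not a proof; it assumes 0-based indices and (msg, flag) pairs, and the names explained_queues and count_ok are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

OK, MARKED = "OK", "marked"
Entry = Tuple[str, str]  # (msg, flag)


def explained_queues(q_d: List[Entry]) -> List[List[str]]:
    """All S-level queues q_S for which some explanation from q_S to q_D exists."""
    n = len(q_d)
    result: List[List[str]] = []
    for k in range(n + 1):
        # A total, strictly increasing map from {0..k-1} into {0..n-1} is
        # determined by its range, which is an increasing k-tuple of indices.
        for rng in combinations(range(n), k):
            # Condition 3: every index outside the range must be marked.
            if all(q_d[j][1] == MARKED for j in range(n) if j not in rng):
                # Condition 4 then fixes q_S element by element.
                q_s = [q_d[j][0] for j in rng]
                if q_s not in result:
                    result.append(q_s)
    return result


def count_ok(q_d: List[Entry]) -> int:
    """The number of elements of q_D whose flag is OK."""
    return sum(1 for _, flag in q_d if flag == OK)
```

For q_D = [(a, OK), (b, marked), (c, marked)] the enumeration yields the four queues [a], [a, b], [a, c], and [a, b, c], whose lengths all lie between #OK(q_D) = 1 and |q_D| = 3.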


We are now ready to define a relation B_DS over states(A_D) × states(A_S). In Lemma 7.11 below
we prove that B_DS is an image-finite backward simulation from A_D to A_S.

However, before we give the actual definition of B_DS, it might be appropriate to discuss how
to define a backward simulation in general. What states should be related? Let us give some
guidelines in terms of A_D and A_S in this example.

Recall that a backward simulation is needed when an implementation postpones some
nondeterminism of the specification. The deletion of messages during a crash in A_S can in A_D be
postponed until after recovery, which indicates that we need a backward simulation from A_D to
A_S. (It is impossible to find a forward simulation from A_D to A_S. See, e.g., [LV92] for details.)
This situation is shown, in a simplified way, in the following picture.


S level:   u_0 --lose--> u_11 --recover_s--> u_21
           u_0 --lose--> u_12 --recover_s--> u_22
           u_0 --lose--> u_13 --recover_s--> u_23

D level:   s_0 --mark--> s_1 --recover_s--> s_2 --drop--> s_31
                                            s_2 --drop--> s_32
                                            s_2 --drop--> s_33

The mark step of A_D marks some messages, and after recovery some of the marked messages
can be deleted by the nondeterministic drop steps. In this simplified example we assume that
there are three ways of deleting messages, leading to states s_31, s_32, and s_33.¹ In A_S this scenario
corresponds to lose having the “same” three ways of deleting messages, leading to states u_11,
u_12, and u_13, followed by recovery.


¹ When dealing with two levels of abstraction, we always let s range over the states of the concrete level and
u over the states of the abstract level.
It seems fairly intuitive that B_DS should relate s_3i to u_2i for 1 ≤ i ≤ 3. But what about s_2?
Well, s_2 is the state right after A_D has recovered, so it should be related to states after A_S has
recovered. Thus, we are down to u_21, u_22, and u_23. Now the point is that s_2 actually corresponds
to all of these states. In some sense B_DS offers an explanation of the nondeterminism occurring
after s_2 by saying that this nondeterminism corresponds to some previous nondeterminism of
A_S, which has led to one of the states u_21, u_22, or u_23.

To check that B_DS is a backward simulation from A_D to A_S we have, among other things,
to verify that each step of A_D corresponds to a sequence of steps of A_S with the same trace.
More specifically, consider, e.g., the step (s_2, drop, s_32) of A_D. According to Condition 3 of
Definition 5.3, we have to verify that for each state of A_S that is related to s_32, here only u_22,
there exists a state u of A_S related to s_2 such that there is a sequence of steps from u to u_22
with an empty trace (since drop is internal). But here we can just choose u to be u_22. This makes
the sequence of steps in A_S empty, which certainly has an empty trace.

For s_1 we can use similar arguments and find that s_1 should be related to all of the states
u_11, u_12, and u_13. Now, consider the step (s_1, recover_s, s_2) of A_D. Again, we have to consider
every state that is related to s_2. Let this state be u_2i for some arbitrary 1 ≤ i ≤ 3. We then have
to find some state u related to s_1 such that there is a sequence of steps from u to u_2i with the
trace recover_s. But here we just choose u = u_1i, and since, for all 1 ≤ i ≤ 3, (u_1i, recover_s, u_2i)
is a step of A_S, we are done.

Finally, of course, s_0 should be related to u_0.


The above example offers some guidelines when defining backward simulations, and even though
the real B_DS from A_D to A_S is more complicated (mainly because of the nondeterminism involved
with the status and the connection between queue and status), the recipe is the same:


    To any state s of A_D, we have to relate all states u of A_S that could have resulted
    from some nondeterminism of A_S that “corresponds” to nondeterminism that may
    happen after state s of A_D.


Of course, one has to use one's intuition about the safe I/O automata in question in order to
identify the “corresponding” nondeterminism.


B_DS can now be defined and motivated.


Definition 7.5 (Image-Finite Backward Simulation from A_D to A_S)


If s ∈ states(A_D) and u ∈ states(A_S), then define that (s, u) ∈ B_DS if there exists an explanation
f from u.queue to s.queue such that the following conditions hold:


1. u.rec_s = s.rec_s and u.rec_r = s.rec_r

2. u.status ∈
       if s.status.flag = OK ∧ (s.queue = ε ∨ (last(s.queue)).flag = OK) then {s.status.stat}
       else {s.status.stat, false}

3. if u.status = ? ∧ s.queue ≠ ε then maxidx(s.queue) ∈ rng(f)


We say that an explanation from u.queue to s.queue is a valid explanation from u to s provided
that Conditions 1-3 are satisfied.

Note that (s, u) ∈ B_DS iff there exists a valid explanation from u to s.
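
For small states, membership in B_DS can be decided by searching for a valid explanation. The Python sketch below is a brute-force illustration under our own assumptions: states are modelled as dataclasses, status values as the strings "?", "true", "false", flags as "OK" and "marked", and indices start at 0. The names DState, SState, and related are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

OK, MARKED = "OK", "marked"


@dataclass
class DState:
    queue: List[Tuple[str, str]]  # (msg, flag) pairs
    rec_s: bool
    rec_r: bool
    status: Tuple[str, str]       # (stat, flag)


@dataclass
class SState:
    queue: List[str]
    rec_s: bool
    rec_r: bool
    status: str


def related(s: DState, u: SState) -> bool:
    """Decide (s, u) in B_DS by searching for a valid explanation."""
    n = len(s.queue)
    # Condition 1: the rec flags agree.
    if (u.rec_s, u.rec_r) != (s.rec_s, s.rec_r):
        return False
    # Condition 2: the set of status values allowed for u.
    stat, flag = s.status
    last_ok = n == 0 or s.queue[-1][1] == OK
    allowed = {stat} if flag == OK and last_ok else {stat, "false"}
    if u.status not in allowed:
        return False
    # Each strictly increasing total map is identified with its range.
    for rng in combinations(range(n), len(u.queue)):
        rest_marked = all(s.queue[j][1] == MARKED
                          for j in range(n) if j not in rng)
        same_msgs = all(s.queue[j][0] == m for j, m in zip(rng, u.queue))
        # Condition 3: with status ?, the last element must be explained.
        cond3 = not (u.status == "?" and n > 0) or (n - 1) in rng
        if rest_marked and same_msgs and cond3:
            return True
    return False
```

As an example, let s have queue [(a, OK), (b, marked)] and status (?, OK). Then s is related to the state with queue [a, b] and status ?, and to the state with queue [a] and status false, but not to the state with queue [a] and status ?, which would violate Condition 3.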


The requirement that there has to be an explanation from u.queue to s.queue in order for
(s, u) ∈ B_DS is a generalization of the example above. Thus, all states u related to s have queues
that can be obtained by deleting some marked messages from s.queue and removing the flags
from the remaining elements.

Condition 1 gives the straightforward correspondence between the rec flags of A_D and A_S.

Condition 2 deals with the status. In A_D there are two ways of losing status (i.e., changing
status.stat to false), and both situations are described in the specification of the drop steps of A_D:
either the element at the end of queue gets deleted, in which case status must be lost, or status
is marked, in which case status may be lost. Alternatively, we can say that if status.flag = OK
and either queue is empty or its last element is OK, the status cannot be changed by a drop
step. Thus, in this case we are not in a situation where A_D is “waiting” to perform some
nondeterminism on status, which has already been performed by A_S. If, on the other hand,
status is marked or the last element on queue is marked, drop may lead to loss of status, and
this corresponds to a loss at the S level, which has already occurred in a lose step of S. Thus,
in this situation B_DS should allow the corresponding state at the S level to have status = false.
This explains Condition 2.

Finally, Condition 3 in the definition of B_DS is a consistency condition between the
explanation f and the value chosen for u.status. The condition should intuitively ensure that whenever
the last element of s.queue is not in the range of f, i.e., when f states that u describes a
situation where the last element of queue has been lost, then u.status must reflect this by having
the value false. Thus, the condition should limit the number of legal combinations of u.queue
and u.status due to the fact that these values are not always independent. The condition could
initially be written as


if s.queue ≠ ε ∧ maxidx(s.queue) ∉ rng(f) then u.status = false

Taking the contrapositive of this condition gives us

if u.status ≠ false then s.queue = ε ∨ maxidx(s.queue) ∈ rng(f)


Now, if u.status = true then Condition 2 gives us that also s.status.stat = true. Invariant 7.1
Part 2 then implies that s.queue is empty. Thus, if u.status = true, the condition is trivially
satisfied. So we only need to deal with the case where u.status = ? and this is exactly Condition
3 of the definition in a slightly rewritten form.


Note that in defining B_DS we have used our intuition about A_S and A_D. It is not at all sure that
a first attempt to define a simulation relation is correct. However, any errors in the definition
will be caught in the subsequent simulation proof and lead to a revised definition, and so on.
For instance, the consistency condition (Condition 3) in the definition of B_DS was added during
a proof attempt that failed. In Lemma 7.11 below we prove that B_DS is in fact an image-finite
backward simulation from A_D to A_S.


The following lemmas make the main simulation proof shorter. 


Lemma 7.6


Let s ∈ states(A_D) and q ∈ Msg* such that there exists an explanation from q to s.queue. Then
there exists a state u ∈ states(A_S) with u.rec_s = s.rec_s, u.rec_r = s.rec_r, u.queue = q, and
(s, u) ∈ B_DS.

Proof


Let f be an arbitrary explanation from q to s.queue and let u.rec_s = s.rec_s, u.rec_r = s.rec_r,
and u.queue = q. We must show that we can define u.status such that Conditions 1-3 of
Definition 7.5 are satisfied.


Condition 1 is trivially satisfied.


We now consider cases, in each case defining u.status and showing that Conditions 2 and 3 are
satisfied.

1. s.queue = ε

   Define u.status = s.status.stat. Then Conditions 2 and 3 are vacuously satisfied.

2. s.queue ≠ ε

   (a) (last(s.queue)).flag = marked

       Define u.status = false. This satisfies Conditions 2 and 3, the latter vacuously.

   (b) (last(s.queue)).flag = OK

       Define u.status = s.status.stat. Then Condition 2 is vacuously satisfied.

       Now, assume that maxidx(s.queue) ∉ rng(f). Then Condition 3 of Definition 7.2 of
       an explanation says that s.queue[maxidx(s.queue)].flag = marked, which is the same
       as (last(s.queue)).flag = marked, but this contradicts the assumptions in this
       subcase. Hence we have that maxidx(s.queue) ∈ rng(f). Thus Condition 3 is satisfied.


Now, define the total function maxqueue : (Msg × Flag)* → Msg* such that for any queue
q_D in the domain, maxqueue(q_D) is defined to be the queue q_S obtained by removing all flag
components from q_D. Formally, we have


q_S = maxqueue(q_D)  iff  |q_S| = |q_D|  and  ∀i ∈ dom(q_D) : q_S[i] = q_D[i].msg


Lemma 7.7 


The identity mapping f from dom(q_D) to dom(q_D) is an explanation from maxqueue(q_D) to q_D.


Proof


We check Conditions 1-4 of Definition 7.2 of an explanation. Since the identity mapping is both
total and strictly increasing, Conditions 1 and 2 are satisfied. Condition 3 is vacuously satisfied
since rng(f) = dom(q_D). From the definition of maxqueue we directly see that also Condition 4
is satisfied.
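
The lemma can be checked mechanically on concrete queues. The following Python sketch is an illustration under our own assumptions: queue elements are (msg, flag) pairs, indices start at 0, and the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (msg, flag)


def maxqueue(q_d: List[Entry]) -> List[str]:
    """Remove the flag component from every element of q_D."""
    return [msg for msg, _flag in q_d]


def identity_mapping(q_d: List[Entry]) -> Dict[int, int]:
    """The identity mapping on dom(q_D)."""
    return {i: i for i in range(len(q_d))}


def identity_explains_maxqueue(q_d: List[Entry]) -> bool:
    """Check Conditions 1-4 of Definition 7.2 for the identity mapping."""
    q_s = maxqueue(q_d)
    f = identity_mapping(q_d)
    total = set(f) == set(range(len(q_s)))
    increasing = all(f[i] < f[i + 1] for i in range(len(q_s) - 1))
    # rng(f) = dom(q_D), so no element is left unexplained.
    unexplained = set(range(len(q_d))) - set(f.values())
    cond3 = all(q_d[j][1] == "marked" for j in unexplained)
    cond4 = all(q_d[f[i]][0] == q_s[i] for i in range(len(q_s)))
    return total and increasing and cond3 and cond4
```

The check succeeds regardless of how the flags of q_D are distributed, which mirrors the fact that Condition 3 holds vacuously.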


Lemma 7.8 


Let s ∈ states(A_D). Then there exists a state u ∈ states(A_S) with u.rec_s = s.rec_s, u.rec_r = s.rec_r,
and u.queue = maxqueue(s.queue), such that (s, u) ∈ B_DS.

Proof


Let q_S = maxqueue(s.queue). Then by Lemma 7.7 there exists an explanation (namely the
identity mapping) from q_S to s.queue. Lemma 7.6 then gives us the existence of a state u with
u.rec_s = s.rec_s, u.rec_r = s.rec_r, and u.queue = q_S such that (s, u) ∈ B_DS. That suffices.


Corollary 7.9
Let s ∈ states(A_D). Then there exists a state u ∈ states(A_S) such that (s, u) ∈ B_DS.


Proof
Immediate from Lemma 7.8.


We state the following trivial lemma without proof. 


Lemma 7.10 


Let q_D be an element of (Msg × Flag)*. Then, any element q_S of Msg*, such that there exists
an explanation from q_S to q_D, can be obtained from maxqueue(q_D) by deleting some elements.


We can now state and prove the main result of this section, namely that the relation B_DS
defined in Definition 7.5 is an image-finite backward simulation from A_D to A_S (with respect to
I_D (Invariant 7.1) and true). The style of the proof is careful mathematical reasoning.


Lemma 7.11 


A_D ≤_B A_S via B_DS.


Proof


We prove that B_DS is an image-finite backward simulation from A_D to A_S with respect to I_D
and true. We first show that B_DS is image-finite and then check the three conditions (which we
call nonemptiness, base case, and inductive case, respectively) of Definition 5.3.


Image-Finiteness


Let s be an arbitrary state of A_D. We must show that there exist only finitely many states
u of A_S such that (s, u) ∈ B_DS. Since rec_s, rec_r, and status can only take on finitely many
values in A_S, these variables cannot give rise to problems. It now remains to be shown that for
a fixed but arbitrary s also queue (in S) can only take on finitely many values. For (s, u) to
be in B_DS there must exist an explanation from u.queue to s.queue. Lemma 7.3 gives us that
|u.queue| ≤ |s.queue|, thus there are only a finite number of lengths to choose from (since s.queue
is a finite queue). Also, there exists only a finite number of mappings (explanations) between
two finite domains. Condition 4 of Definition 7.2 finally gives us that the elements of the possible
u.queue values are uniquely determined by s.queue and the (finitely many) explanations. Hence,
u.queue can only take on finitely many values given s. That suffices.




Nonemptiness 
Corollary 7.9 immediately gives the result. 
Base Case 


Let s be the (unique) start state of A_D. If (s, u) ∈ B_DS, then u.rec_s = s.rec_s = false,
u.rec_r = s.rec_r = false, u.status = s.status.stat = false (since s.status.flag = OK and s.queue =
ε), and u.queue = ε (since the existence of an explanation from u.queue to s.queue and the
fact that s.queue = ε imply that u.queue = ε). Thus, u is the unique start state of A_S. That
suffices.


Inductive Case 


Assume (s, a, s') ∈ steps(A_D) such that s and s' satisfy I_D (Invariant 7.1), and let u' be an
arbitrary state of A_S such that (s', u') ∈ B_DS. Below we consider cases based on a (and sometimes
sub-cases of each case) and for each (sub)case we define a finite execution fragment α of A_S
with lstate(α) = u', (s, fstate(α)) ∈ B_DS, and trace(α) = trace(a). In this particular proof all
execution fragments will be of length zero or one. Thus, in each (sub)case we will either


• define an action b ∈ acts(A_S) and a state u ∈ states(A_S), such that (u, b, u') ∈ steps(A_S),
  (s, u) ∈ B_DS, and trace(b) = trace(a), or
• show that (s, u') ∈ B_DS and a is internal.

In the former case, we show that (u, b, u') ∈ steps(A_S) by showing that all four state variables
of A_S are related in u and u' according to the definition of the b steps of A_S.


In the proof, when we refer to Conditions 1-3, we mean Conditions 1-3 of Definition 7.5 of B_DS
unless otherwise specified.
a = send_msg(m) 


In this case we show that we can define u such that (u, send_msg(m), u') ∈ steps(A_S) and
(s, u) ∈ B_DS. Clearly the step has the right trace.


We have s'.queue = s.queue ^ (m, OK) and s'.status = (?, OK). Lemma 7.4 implies u'.queue ≠ ε.


Define u.rec_s = s.rec_s
       u.rec_r = s.rec_r
       u.queue = init(u'.queue)


First we find an explanation from u.queue to s.queue. Let f' be a valid explanation from
u' to s'. (Such a valid explanation exists since (s', u') ∈ B_DS.) Since last(s'.queue).flag =
OK, we have from Lemma 7.4 and Conditions 1-3 of Definition 7.2 of an explanation that
f'(maxidx(u'.queue)) = maxidx(s'.queue). Then f = f' ↾ dom(u.queue) is clearly an
explanation from u.queue to s.queue.


Now, by Lemma 7.6, define u.status such that (s, u) ∈ B_DS.
It remains to show that (u, send_msg(m), u') ∈ steps(A_S):


rec_s and rec_r:
From the definition of the send_msg(m) steps of A_D, the definition of u, and the fact that
(s', u') ∈ B_DS, we have that u'.rec_s = s'.rec_s = s.rec_s = u.rec_s and correspondingly for rec_r.
This is as required by the definition of the send_msg(m) steps of A_S.

status:
Since (s', u') ∈ B_DS, Condition 2 implies that u'.status = ?. No matter what the value of
u.status is, this is as required by the definition of the send_msg(m) steps of A_S.

queue:
We have u'.queue ≠ ε (by Lemma 7.4) and last(u'.queue) = m (by use of Definition 7.2 of
an explanation). Then, by definition, we have u'.queue = init(u'.queue) ^ last(u'.queue) =
u.queue ^ m. Again, this is as required by the definition of the send_msg(m) steps of A_S.


a = crash_s

Define u.rec_s = s.rec_s
       u.rec_r = u'.rec_r
       u.status = u'.status
       u.queue = u'.queue


Then it is easy to see that (s, u) ∈ B_DS (any valid explanation from u' to s' is also a valid
explanation from u to s) and that (u, crash_s, u') ∈ steps(A_S).


a = crash_r


Similar to the case a = crash_s.


a = receive_msg(m) 
In this case we define $u$ such that $(u, receive\_msg(m), u') \in steps(A_S)$ and $(s, u) \in B_{DS}$. Clearly
the step has the right trace.


From the definition of the receive_msg(m) steps of $A_D$ we have that $s.rec_s = s'.rec_s$, $s.rec_r =
s'.rec_r$, $s.queue \neq \varepsilon$ with $(head(s.queue)).msg = m$, and $s'.queue = tail(s.queue)$.


Define  $u.rec_s = s.rec_s$
        $u.rec_r = s.rec_r$
        $u.queue = m \,{}^{\frown}\, u'.queue$


We first find an explanation from $u.queue$ to $s.queue$. Let $f'$ be any valid explanation from $u'$
to $s'$ (we know it exists), and define $f$ in the following way:


$$f \triangleq [\, (i+1) \mapsto (f'(i)+1) \mid i \in dom(f') \,] \cup [\, 0 \mapsto 0 \,]$$


Intuitively f relates the same elements in u.queue and s.queue that were related by f’ in u’.queue 
and s’.queue (these elements all have their indices increased by one because of the new elements 
at the head of the queues), and relates these new messages m. Based on the fact that f’ is an 
explanation from u’.queue to s’.queue, it is easy to check that f is an explanation from u.queue 
to s.queue. 
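
The construction of $f$ from $f'$ can be illustrated with a small sketch. It is only an illustration under stated assumptions: explanations are represented as Python dictionaries from indices of the abstract queue to indices of the concrete queue, index 0 is taken to be the head of a queue, and the names `shift_explanation`, `is_strictly_increasing` and the sample `f_prime` are hypothetical and do not appear in the report.

```python
def shift_explanation(f_prime: dict[int, int]) -> dict[int, int]:
    """Build f from f' when one element is prepended to both queues.

    Keys are indices into u.queue, values are indices into s.queue
    (index 0 is assumed to be the head of a queue).
    """
    # Every pair related by f' keeps its partner; both indices move up by one.
    f = {i + 1: j + 1 for i, j in f_prime.items()}
    # The two new head elements (both carrying m) are related to each other.
    f[0] = 0
    return f


def is_strictly_increasing(f: dict[int, int]) -> bool:
    """Check that a finite index mapping is strictly increasing."""
    keys = sorted(f)
    return all(f[a] < f[b] for a, b in zip(keys, keys[1:]))


# Hypothetical f': u'.queue has two elements, related to indices 0 and 2 of s'.queue.
f_prime = {0: 0, 1: 2}
f = shift_explanation(f_prime)
```

Because both indices of every pair are shifted by the same amount and the new pair sits at index 0, strict monotonicity of $f'$ carries over to $f$, which is the part of Definition 7.2 that the sketch checks.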


We consider cases, in each case defining $u.status$, showing $(s, u) \in B_{DS}$ by showing that Conditions
2-3 hold (Condition 1 clearly holds), and showing that $(u, receive\_msg(m), u') \in steps(A_S)$.
For the latter part it is easy to see that a receive_msg(m) step is enabled in $u$ and that $rec_s$,
$rec_r$ and queue are handled correctly. So all we need to do is show that status is also handled
correctly in the receive_msg(m) step of $A_S$.

1. $s.status.stat = true$

Invariant 7.1 Part 2 implies that this situation cannot occur.

2. $s.status.stat = false$
Define $u.status = false$.
Then clearly $(s, u) \in B_{DS}$ (Conditions 2 and 3 are vacuously satisfied).
status:
Since $s.status.stat = false$, we have $s'.status = s.status$, so $u'.status = false$. Leaving
status = false unchanged is permitted by the definition of the receive_msg(m) steps in
$A_S$.

3. $s.status.stat = ?$


(a) $u'.queue \neq \varepsilon$

Then also $s'.queue \neq \varepsilon$ (by Lemma 7.3), so from the definition of receive_msg(m) in
$A_D$ we have $s'.status = s.status$.

Define $u.status = u'.status$.

Condition 2:
Since $(s', u')$ satisfies Condition 2, $(s, u)$ also satisfies that condition. (Neither the
emptiness of queue, the status flag, nor the flag of the last element in queue is changed
in the step in $A_D$.)

Condition 3:
Assume that $u.status \,(= u'.status) = ?$. Since $s.queue \neq \varepsilon$, we must show that
$maxidx(s.queue) \in rng(f)$. Since $s'.queue \neq \varepsilon$, and $(s', u')$ and $f'$ satisfy Condition
3, we have $maxidx(s'.queue) \in rng(f')$, so from the construction of $f$ it is easy to
see that $maxidx(s.queue) \in rng(f)$.

status:
Leaving status unchanged is as required by the definition of receive_msg(m) in $A_S$
since we assume that $u'.queue \neq \varepsilon$.


(b) $u'.queue = \varepsilon$


i. $s'.queue = \varepsilon$
Then the definition of receive_msg(m) in $A_D$ implies that $s'.status.stat = true$ and
$s'.status.flag = s.status.flag$. Then, by definition of $B_{DS}$, $u'.status$ is either true or
false. We consider cases.
A. $s'.status.flag = OK$ or ($s'.status.flag = marked$ and $u'.status = true$)
If $s'.status.flag = OK$, then by Condition 2 we also have $u'.status = true$ since
$s'.status.stat = true$.
Define $u.status = ?$ $(= s.status.stat)$.
Condition 2:
Vacuously satisfied by $(s, u)$.
Condition 3:
Since $s'.queue = \varepsilon$, we have $|s.queue| = 1$. Now, since $f(0) = 0$, we have
$maxidx(s.queue) \in rng(f)$ as required.
status:
Changing status from ? to true when $u'.queue = \varepsilon$ is as required by the definition
of receive_msg(m) in $A_S$.
B. $s'.status.flag = marked$ and $u'.status = false$
Define $u.status = false$.
Condition 2:
Since $s.status.flag = s'.status.flag = marked$, we have that $(s, u)$ satisfies Condition 2.
Condition 3:
Vacuously satisfied.
status:
Leaving status = false unchanged is allowed by receive_msg(m) in $A_S$.
ii. $s'.queue \neq \varepsilon$
The definition of receive_msg(m) in D implies $s'.status.stat = s.status.stat = ?$
and $s'.status.flag = s.status.flag$. Since $u'.queue = \varepsilon$, $s'.queue \neq \varepsilon$, and $(s', u')$
and $f'$ satisfy Condition 3, we get that $u'.status \neq ?$ ($f'$ must be empty). Note
that this is one of the two places in the entire proof where we need the consistency
condition (Condition 3). Condition 2 now gives us that $u'.status = false$ and that
either $s'.status.flag = marked$ or $(last(s'.queue)).flag = marked$.
Define $u.status = false$.
Condition 2:
Since $s.status.flag = s'.status.flag$, $(last(s'.queue)).flag = (last(s.queue)).flag$,
and one of these flag values is marked, we see that $(s, u)$ satisfies Condition 2.
Condition 3:
Vacuously satisfied.
status:
Leaving status = false unchanged is allowed by the definition of receive_msg(m)
in $A_S$.


a = ack(b) 


In this case we define $u$ such that $(u, ack(b), u') \in steps(A_S)$ and $(s, u) \in B_{DS}$. Clearly the step
has the right trace.


From the definition of ack(b) in $A_D$, we have that $s.status.stat = b$ and that $s' = s$ except that
$s'$ and $s$ may differ on the value of status.flag, which is set to OK in the step.


We consider cases based on the value of $b$.


1. $b = false$

Then $u'.status = false$.

Define $u = u'$.

It is now easy to see that $(s, u) \in B_{DS}$. (The fact that $s$ and $s'$ may differ on the value of
status.flag could only cause trouble in Condition 2, but this is seen not to be the case since
$s.status.stat = false$ implies that the only choice for $u.status$ is false, as we have defined it
to be.)

Now, since $u' = u$, we have $u.status = false$. Thus, an ack(b) step is enabled in $u$. Again,
since $u = u'$, we now see that $(u, ack(b), u')$ is a step of $A_S$ as required.


2. $b = true$

Since $s.status.stat = s'.status.stat = true$, Invariant 7.1 Part 2 gives us that $s'.queue = \varepsilon$
and $s.queue = \varepsilon$. Furthermore, since $s'.status.flag = OK$, we get from Condition 2 that
$u'.status = true$.

Define $u = u'$.

As in the previous case, clearly $(s, u) \in B_{DS}$ and $(u, ack(b), u') \in steps(A_S)$.

$a = recover_s$


Define  $u.rec_s = false$
        $u.rec_r = u'.rec_r$
        $u.status = u'.status$
        $u.queue = u'.queue$


Since $u.rec_s = s.rec_s = false$, it is easy to see that $(s, u) \in B_{DS}$ (any valid explanation from $u'$
to $s'$ is also a valid explanation from $u$ to $s$) and that $(u, recover_s, u') \in steps(A_S)$ (and clearly
has the right trace).


$a = recover_r$


Similar to the case $a = recover_s$.


$a = mark(I)$
In this case we define $u$ and $I'$ such that $(u, lose(I'), u') \in steps(A_S)$ and $(s, u) \in B_{DS}$. Clearly
the step has the right trace (the empty trace).


From the definition of the mark steps in $A_D$ we have $s'.rec_s = s.rec_s$, $s'.rec_r = s.rec_r$, and either
$s.rec_s = true$ or $s.rec_r = true$.


Define  $u.rec_s = s.rec_s$
        $u.rec_r = s.rec_r$
        $u.queue = maxqueue(s.queue)$
        $u.status = s.status.stat$


By Lemma 7.7 the identity mapping $f$ is an explanation from $u.queue$ to $s.queue$, and it is easy
to show that $f$ is a valid explanation from $u$ to $s$. Thus, $(s, u) \in B_{DS}$.


To show that $(u, lose(I'), u') \in steps(A_S)$, we first observe that since $(s, u) \in B_{DS}$ we have
$u.rec_s = true$ or $u.rec_r = true$, so a $lose(I')$ step is enabled in $u$.


$rec_s$ and $rec_r$:
$u'.rec_s = s'.rec_s = s.rec_s = u.rec_s$ and similarly for $rec_r$. This is as required by the definition
of $lose(I')$ in $A_S$.

queue:
First observe that $maxqueue(s.queue) = maxqueue(s'.queue)$. Then, since by definition we
have $u.queue = maxqueue(s.queue)$, Lemma 7.10 implies that $u'.queue$ can be obtained from
$u.queue$ by deleting some (possibly zero) elements. Thus, we can define $I'$ accordingly, and
this is as required by the definition of $lose(I')$ in $A_S$.

status:
First note that since we might have $s'.status.flag = marked$, we also might have $u'.status =
false$ by Condition 2, but since $lose(I')$ can always change status to false in $A_S$, this situation
does not cause trouble.


The situation that could cause trouble is if $u'.status \neq false$ but the $lose(I')$ step is required
to change status to false because the element at the end of $u.queue$ must be deleted in order
to treat queue correctly. We must show that this situation is impossible.

Assume that $u'.status \neq false$. Then Condition 2 and the definition of $mark(I)$ in $A_D$ give
$u'.status = s'.status.stat = s.status.stat \neq false$. We consider cases.

1. $u.status = s'.status.stat = s.status.stat = true$.
Invariant 7.1 Part 2 implies $s.queue = s'.queue = \varepsilon$. Then Lemma 7.3 implies that
$u.queue = u'.queue = \varepsilon$. Thus $I' = \emptyset$. That suffices.
2. $u.status = s'.status.stat = s.status.stat = ?$.
(a) $s.queue = \varepsilon$
Similar to Case 1.
(b) $s.queue \neq \varepsilon$
Then Condition 3 and Definition 7.2 imply $f(maxidx(u.queue)) = maxidx(s.queue)$.
It is now easy to see that $u'.queue$ can be obtained by deleting some elements, but
not the element at the end, from $u.queue$. That suffices.


$a = unmark(I)$


In this case we show that $unmark(I)$ in $A_D$ corresponds to an empty step in $A_S$ (remember that
$unmark(I)$ is internal). Thus, we show that $(s, u') \in B_{DS}$.


From the definition of the $unmark(I)$ steps of $A_D$, we have that $s'.queue$ is obtained from $s.queue$
by changing some (maybe zero) flag values from marked to OK. Now, let $f'$ be a valid explanation
from $u'$ to $s'$. Then by Definition 7.2 it is easy to see that $f'$ is also an explanation from $u'.queue$
to $s.queue$. (The only interesting case is Condition 3 of Definition 7.2, but since messages that
are marked in $s'.queue$ cannot be OK in $s.queue$, this case is easily checked.)


We show that $f'$ is a valid explanation from $u'$ to $s$ by checking Conditions 1-3.


Condition 1:
This condition is satisfied since the $unmark(I)$ step does not change $rec_s$ and $rec_r$.
Condition 2:
The unmarking of status and message flags might lead to the requirement that $u'.status =
s'.status.stat$ (by Condition 2). But then obviously $(s, u')$ also satisfies Condition 2 since both
the “then” and the “else” in this condition allow $u'.status = s.status.stat \,(= s'.status.stat)$.
The important thing to note here is that $unmark(I)$ cannot lead from a situation where the
“then” clause must be chosen to a situation where the “else” clause must be chosen.
Condition 3:
Since Condition 3 does not mention any flag values, it is seen that $(s, u')$ and $f'$ satisfy this
condition.


a = drop(I) 


In this case we show that drop corresponds to an empty step of $A_S$, i.e., that $(s, u') \in B_{DS}$ (recall
that $drop(I)$ is internal).


Let $f'$ be an arbitrary valid explanation from $u'$ to $s'$. We now construct an explanation $f$ from
$u'.queue$ to $s.queue$: $I$ contains the indices of the elements of $s.queue$ that were deleted in the
$drop(I)$ step. Then $|dom(s'.queue)| = |dom(s.queue) \setminus I|$. Now, let $g$ be the (unique) bijective,
strictly increasing mapping from $dom(s'.queue)$ to $dom(s.queue) \setminus I$. Informally, $g$ maps indices
of elements in $s'.queue$ to the indices the same elements had in $s.queue$.


Define $f = g \circ f'$. To check that $f$ is an explanation from $u'.queue$ to $s.queue$, we check
Conditions 1-4 of Definition 7.2:


Conditions 1-2 of Definition 7.2:
Since $f'$ is total and strictly increasing from $dom(u'.queue)$ to $dom(s'.queue)$ and $g$ is total and
strictly increasing from $dom(s'.queue)$ to $dom(s.queue) \setminus I$, $f$ is total and strictly increasing
from $dom(u'.queue)$ to $dom(s.queue)$.

Condition 3 of Definition 7.2:
We have that $dom(s.queue) \setminus rng(g \circ f') = I \cup g^{-1}(dom(s'.queue) \setminus rng(f'))$. This informally
states that if an element of $s.queue$ is not “hit” by $f$, then this is because either the element is
one of the elements that are deleted in the $drop(I)$ step or because the “corresponding” (by
$g$) element in $s'.queue$ is not “hit” by $f'$. Now, all elements in $s.queue$ with indices in $I$ are
marked (by the precondition of $drop(I)$). Since $f'$ is an explanation, all elements of $s'.queue$
with indices in $dom(s'.queue) \setminus rng(f')$ are marked, and since $g$ and then also $g^{-1}$ maps the
index of an element to the index of the same element, we have that all elements of $s.queue$
with indices in $g^{-1}(dom(s'.queue) \setminus rng(f'))$ are marked. That suffices.

Condition 4 of Definition 7.2:
By the fact that $f'$ is an explanation (and therefore satisfies Condition 4) and the fact that
$g$ maps the index of an element to the index of the same element, it directly follows that $f$
satisfies Condition 4 of Definition 7.2.


Thus, $f$ is an explanation from $u'.queue$ to $s.queue$.
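
The re-indexing map $g$ and the composed explanation can be made concrete with a small sketch. It is an illustration only, assuming queues are indexed from 0 and explanations are represented as Python dictionaries; the function names and the sample values are hypothetical and not taken from the report.

```python
def reindex_after_drop(n: int, dropped: set[int]) -> dict[int, int]:
    """Return g: index in s'.queue -> index the same element had in s.queue.

    n is the length of s.queue and dropped is the index set I.
    """
    survivors = [i for i in range(n) if i not in dropped]
    # Enumerating the survivors in order gives the unique strictly
    # increasing bijection onto dom(s.queue) \ I.
    return dict(enumerate(survivors))


def compose(g: dict[int, int], f_prime: dict[int, int]) -> dict[int, int]:
    """Return f = g o f', mapping indices of u'.queue to indices of s.queue."""
    return {i: g[j] for i, j in f_prime.items()}


# Hypothetical instance: s.queue has 5 elements, indices 1 and 3 are dropped.
dropped = {1, 3}
g = reindex_after_drop(5, dropped)
f_prime = {0: 0, 1: 2}   # u'.queue (2 elements) -> s'.queue (3 elements)
f = compose(g, f_prime)

# Indices of s.queue that f does not hit.
unhit = set(range(5)) - set(f.values())
```

In this instance the indices of s.queue not hit by $f$ are exactly the dropped indices together with the old position of the one surviving element that $f'$ does not hit, which is the decomposition used in the check of Condition 3.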


It now remains to show that $f$ is a valid explanation from $u'$ to $s$, i.e., we must check Conditions
1-3.

Condition 1:
Condition 1 is clearly satisfied (since neither $rec_s$ nor $rec_r$ is changed in the $drop(I)$ step and
$(s', u') \in B_{DS}$).

Condition 2:
We consider the ways status can change in the if-statement in the definition of the $drop(I)$
step.
Assume that the element at the end of $s.queue$ is deleted in the $drop(I)$ step. Then $s'.status =
(false, OK)$, which implies $u'.status = false$. But in order to be able to delete the element at the
end of $s.queue$ we have that $s.queue \neq \varepsilon$ and $(last(s.queue)).flag = marked$, so $(s, u')$ satisfies
Condition 2.
Then assume that the element at the end of $s.queue$ is not deleted but that status is
changed to (false, OK) since $s.status.flag = marked$. Again we have $u'.status = false$, and since
$s.status.flag = marked$, we have that $(s, u')$ satisfies Condition 2.
The last possibility is that status is not changed at all in the $drop(I)$ step, but then obviously
$(s, u')$ satisfies Condition 2 since $(s', u')$ satisfies it.

Condition 3:
Assume $u'.status = ?$ and $s.queue \neq \varepsilon$. Since $u'.status = ?$ we must have $s'.status.stat = ?$,
and then from the definition of the $drop(I)$ step we infer $s.status = s'.status$.
Then the element at the end of $s.queue$ is not deleted in the $drop(I)$ step (i.e., $maxidx(s.queue) \notin
I$) since otherwise $s'.status = (false, OK)$. Thus, also $s'.queue \neq \varepsilon$. Since $f'$ is a valid explanation
from $u'$ to $s'$, Condition 3 gives us $maxidx(s'.queue) \in rng(f')$, and since $maxidx(s.queue) \notin
I$ we must have $g(maxidx(s'.queue)) = maxidx(s.queue)$ since otherwise $g$ could not be bijective
and strictly increasing. All in all we get $maxidx(s.queue) \in rng(f)$, as required.


This concludes the simulation proof. 

We can now prove that $A_D$ safely implements $A_S$.


Theorem 7.12 ($A_D$ safely implements $A_S$)
$A_D \sqsubseteq_S A_S$


Proof 


Directly by Lemma 7.11 and the soundness of image-finite backward simulations with respect 
to the safe implementation relation (Lemma 5.8). 


7.2.3 Correctness


Before we can prove the main theorem of this chapter, namely that D is a correct implementation of
S, we need to prove some basic lemmas about S and D. In the remainder of this chapter we
use the following abbreviations.


$SM \triangleq \{send\_msg(m) \mid m \in Msg\}$
$RM \triangleq \{receive\_msg(m) \mid m \in Msg\}$


From the safe I/O automata $A_S$ and $A_D$ we get the following lemmas.


Lemma 7.13
$A_S \models \Box(\Box(status \in Bool) \Rightarrow \Box\neg\langle SM \rangle)$
Proof


Immediate from the definition of $A_S$ since any send_msg(m) step would change status to ?.


Lemma 7.14
1. $A_D \models \Box(\Box\neg\langle SM \rangle \Rightarrow \Box(|queue^{\circ}| \le |queue|))$


2. $A_D \models \Box(\langle RM \rangle \Rightarrow |queue^{\circ}| = |queue| - 1)$


Proof


Immediate from the definition of $A_D$ since only send_msg(m) steps can add elements to queue,
and receive_msg(m) steps remove one element from queue.
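
The two step-level facts used in this proof can be phrased as a check on a finite sequence of steps. The sketch below is illustrative only: it assumes each step is summarised as a hypothetical triple (action name, queue length before, queue length after), and the function name and the sample traces are not part of the report.

```python
def satisfies_lemma_7_14(steps) -> bool:
    """steps: iterable of (action_name, len_before, len_after) triples."""
    for action, before, after in steps:
        # Only send_msg steps may add elements to the queue.
        if action != "send_msg" and after > before:
            return False
        # A receive_msg step removes exactly one element.
        if action == "receive_msg" and after != before - 1:
            return False
    return True


# Hypothetical step summaries.
ok_trace = [("send_msg", 0, 1), ("send_msg", 1, 2), ("drop", 2, 1), ("receive_msg", 1, 0)]
bad_trace = [("mark", 1, 2)]
```

Here `ok_trace` passes and `bad_trace` fails, because a step other than send_msg lengthens the queue.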


The following two lemmas prove properties of live executions of D. The lemmas deal with live 
executions where, from some point on, no send_msg(m) actions occur and neither the sender nor 
the receiver is in recovery phase. Then, in the first lemma, we prove that eventually elements will 
be removed from queue, which, in the second lemma, is used to prove that queue is eventually 
emptied. 

The proofs of the lemmas introduce the way we write structured proofs of temporal properties 
of our systems. The proof style is due to Lamport. The following description is taken from
[AL92b]:

We use hierarchically structured proofs. The theorem to be proved is statement
$\langle 0 \rangle 1$. The proof of statement $\langle i \rangle j$ is either an ordinary paragraph-style proof or the
sequence of statements $\langle i+1 \rangle 1$, $\langle i+1 \rangle 2$, ... and their proofs. ... Within a proof,
$\langle k \rangle l$ denotes the most recent statement with that number. A statement has the form

    ASSUME: Assump PROVE: Goal

which is abbreviated to Goal if there is no assumption. The assertion Q.E.D. in
statement number $\langle i+1 \rangle k$ of the proof of statement $\langle i \rangle j$ denotes the goal of statement
$\langle i \rangle j$. The statement

    CASE: Assump

is an abbreviation for

    ASSUME: Assump PROVE: Q.E.D.

Within the proof of statement $\langle i \rangle j$, Assumption $\langle i \rangle$ denotes that statement's assumption,
and Assumption $\langle i \rangle.k$ denotes the assumption's $k$th item.


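Lemma 7.15

$L_D \models \forall k : \Box(\Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow ((|queue| = k \wedge k > 0) \leadsto |queue| < k))$

Proof

ASSUME: $\alpha \in L_D$
PROVE:  $\alpha \models \forall k : \Box(\Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow ((|queue| = k \wedge k > 0) \leadsto |queue| < k))$

$\langle 1 \rangle 1$. ASSUME: $k \ge 0$
        PROVE:  $\alpha \models \Box(\Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow ((|queue| = k \wedge k > 0) \leadsto |queue| < k))$
  $\langle 2 \rangle 1$. ASSUME: $\alpha_1$ is an arbitrary suffix of $\alpha$
          PROVE:  $\alpha_1 \models \Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow ((|queue| = k \wedge k > 0) \leadsto |queue| < k)$
    $\langle 3 \rangle 1$. ASSUME: $\alpha_1 \models \Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false)$
            PROVE:  $\alpha_1 \models (|queue| = k \wedge k > 0) \leadsto |queue| < k$
      $\langle 4 \rangle 1$. $\alpha_1 \models \Box\neg\langle SM \rangle \Rightarrow \Box(|queue^{\circ}| \le |queue|)$
              PROOF: By Lemma 7.14 Part 1, Lemma 3.5 Part 1 and Rule Par.
      $\langle 4 \rangle 2$. $\alpha_1 \models \Box(|queue^{\circ}| \le |queue|)$
              PROOF: By $\langle 4 \rangle 1$, Assumption $\langle 3 \rangle$, and Rule MP.
      $\langle 4 \rangle 3$. $\alpha_1 \models \Box((|queue| = k \wedge k > 0) \Rightarrow (|queue| = k \;W\; |queue| < k))$
              PROOF: By $\langle 4 \rangle 2$.
      $\langle 4 \rangle 4$. $\alpha_1 \models WF(RM, rec_s = false \wedge rec_r = false)$
              PROOF: By proof assumption ($\alpha \in L_D$) and definition of $Q_D$, which
              induces $L_D$.
      $\langle 4 \rangle 5$. $\alpha_1 \models \Box\Diamond\neg(rec_s = false \wedge rec_r = false \wedge |queue| > 0) \vee \Box\Diamond\langle RM \rangle$
              PROOF: By $\langle 4 \rangle 4$, the definition of WF, and noting that $enabled(RM) =
              (|queue| > 0)$.
      $\langle 4 \rangle 6$. $\alpha_1 \models \Box\Diamond\neg(rec_s = false \wedge rec_r = false \wedge |queue| > 0) \vee \Diamond\langle RM \rangle$
              PROOF: By $\langle 4 \rangle 5$, Lemma 3.5 Part 1, and definition of disjunction.
      $\langle 4 \rangle 7$. $\alpha_1 \models \Diamond\neg(rec_s = false \wedge rec_r = false \wedge |queue| > 0) \vee \Diamond\langle RM \rangle$
              PROOF: By $\langle 4 \rangle 6$, Rule Par, and the definition of disjunction.
      $\langle 4 \rangle 8$. $\alpha_1 \models \Box(rec_s = false \wedge rec_r = false \wedge |queue| > 0) \Rightarrow \Diamond\langle RM \rangle$
              PROOF: By rewriting $\langle 4 \rangle 7$.
      $\langle 4 \rangle 9$. $\alpha_1 \models \Box(|queue| > 0) \Rightarrow \Diamond\langle RM \rangle$
              PROOF: By Assumption $\langle 3 \rangle$, $\langle 4 \rangle 8$, and Rule MP.
      $\langle 4 \rangle 10$. $\alpha_1 \models (|queue| = k \wedge \langle RM \rangle) \leadsto |queue| < k$
              PROOF: Implied by Lemma 7.14 Part 2.
      $\langle 4 \rangle 11$. Q.E.D.
              PROOF: By $\langle 4 \rangle 3$, $\langle 4 \rangle 9$, $\langle 4 \rangle 10$, and Rule Pro2.
    $\langle 3 \rangle 2$. Q.E.D.
            PROOF: By $\langle 3 \rangle 1$ and the definition of implication.
  $\langle 2 \rangle 2$. Q.E.D.
          PROOF: By $\langle 2 \rangle 1$ and Lemma 3.5 Part 2.
$\langle 1 \rangle 2$. Q.E.D.
        PROOF: By $\langle 1 \rangle 1$ and Lemma 3.5 Part 5.


Lemma 7.16

$L_D \models \Box(\Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow \Diamond\Box(queue = \varepsilon))$

Proof

ASSUME: $\alpha \in L_D$
PROVE:  $\alpha \models \Box(\Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow \Diamond\Box(queue = \varepsilon))$

$\langle 1 \rangle 1$. ASSUME: $\alpha_1$ is an arbitrary suffix of $\alpha$
        PROVE:  $\alpha_1 \models \Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false) \Rightarrow \Diamond\Box(queue = \varepsilon)$
  $\langle 2 \rangle 1$. ASSUME: $\alpha_1 \models \Box(\neg\langle SM \rangle \wedge rec_s = false \wedge rec_r = false)$
          PROVE:  $\alpha_1 \models \Diamond\Box(queue = \varepsilon)$
    $\langle 3 \rangle 1$. $\alpha_1 \models \forall k : ((|queue| = k \wedge k > 0) \leadsto |queue| < k)$
            PROOF: By Lemma 7.15, Lemma 3.5 Parts 1, 5, and 6, and Rules Par and MP.
    $\langle 3 \rangle 2$. $\alpha_1 \models \forall k : (k > 0 \Rightarrow \exists k' : (k' < k \wedge (|queue| = k \leadsto |queue| = k')))$
            PROOF: By $\langle 3 \rangle 1$ and Lemma 3.5 Part 7.
    $\langle 3 \rangle 3$. $\alpha_1 \models \Diamond(|queue| = 0)$
            PROOF: By $\langle 3 \rangle 2$ and Rule Pro1.
    $\langle 3 \rangle 4$. $\alpha_1 \models \Box\neg\langle SM \rangle \Rightarrow \Box(|queue^{\circ}| \le |queue|)$
            PROOF: By Lemma 7.14 Part 1, Lemma 3.5 Part 1 and Rule Par.
    $\langle 3 \rangle 5$. $\alpha_1 \models \Box(|queue^{\circ}| \le |queue|)$
            PROOF: By $\langle 3 \rangle 4$, Assumption $\langle 2 \rangle$, and Rule MP.
    $\langle 3 \rangle 6$. $\alpha_1 \models \forall k : \Box(|queue| = k \Rightarrow (|queue| = k \;W\; |queue| < k))$
            PROOF: By $\langle 3 \rangle 5$.
    $\langle 3 \rangle 7$. $\alpha_1 \models \Box(|queue| = 0 \Rightarrow (|queue| = 0 \;W\; |queue| < 0))$
            PROOF: By $\langle 3 \rangle 6$ and Lemma 3.5 Part 6.
    $\langle 3 \rangle 8$. $\alpha_1 \models \Box(|queue| = 0 \Rightarrow \Box(|queue| = 0))$
            PROOF: By $\langle 3 \rangle 7$, the fact that $|queue| < 0$ is always false, and the definition
            of $\Box$.
    $\langle 3 \rangle 9$. $\alpha_1 \models \Diamond\Box(|queue| = 0)$
            PROOF: By $\langle 3 \rangle 3$, $\langle 3 \rangle 8$, and Rule MP1.
    $\langle 3 \rangle 10$. Q.E.D.
            PROOF: Directly by $\langle 3 \rangle 9$.
  $\langle 2 \rangle 2$. Q.E.D.
          PROOF: By $\langle 2 \rangle 1$ and definition of implication.
$\langle 1 \rangle 2$. Q.E.D.
        PROOF: By $\langle 1 \rangle 1$ and Lemma 3.5 Part 2.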


An important advantage of this way of writing structured proofs of temporal properties is that 
at a first reading, one can concentrate on the first outermost levels of the proof. Once that has 
been understood, the details at lower levels can be considered. 

The next lemma contains the main part of the proof that D correctly implements S. It
states that for any $B_{DS}$-related executions of $A_D$ and $A_S$, if the execution of $A_D$ satisfies $Q_D$ (the
temporal formula which induces the liveness condition $L_D$), then the execution of $A_S$ satisfies
$Q_S$ (the temporal formula which induces the liveness condition $L_S$). The proof will be a proof
by cases based on a proof by contradiction: if we assume the execution of $A_S$ is not live, this
means that the execution does not satisfy one of the weak fairness formulas in the definition of
$Q_S$. By considering the weak fairness formulas one by one and deriving a contradiction in each
case, the result follows.
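
To make the negated fairness formulas concrete, the following sketch decides whether an ultimately periodic execution violates a weak fairness condition. It assumes the reading used when WF is expanded in the proof below: the condition fails exactly when, from some point on, the guard and the enabling predicate hold continuously while no action of the class occurs. The representation of states as dictionaries, the action names, and all function names are hypothetical.

```python
def violates_wf(loop, in_class, guard_and_enabled) -> bool:
    """Decide not-WF on an execution that eventually repeats `loop` forever.

    loop is a non-empty list of (state, action) pairs, where action is the
    action taken from that state.
    """
    if not loop:
        raise ValueError("loop must be non-empty")
    # "Eventually always" holds on such an execution iff it holds at every
    # position of the repeated part.
    return all(guard_and_enabled(s) and not in_class(a) for s, a in loop)


def quiet_and_nonempty(state) -> bool:
    """Guard rec_s = rec_r = false together with queue being non-empty."""
    return (not state["rec_s"]) and (not state["rec_r"]) and state["queue_len"] > 0


def is_rm(action) -> bool:
    return action == "receive_msg"


# Hypothetical loops: one that starves receive_msg, one that does not.
stuck = [({"rec_s": False, "rec_r": False, "queue_len": 1}, "unmark")]
fair = [({"rec_s": False, "rec_r": False, "queue_len": 0}, "send_msg"),
        ({"rec_s": False, "rec_r": False, "queue_len": 1}, "receive_msg")]
```

On `stuck` the check reports a violation, since a message stays in the queue forever without being received; on `fair` it does not.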
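
Lemma 7.17

Let $\alpha \in exec(A_D)$ and $\alpha' \in exec(A_S)$ be arbitrary executions of $A_D$ and $A_S$, respectively, with
$(\alpha, \alpha') \in B_{DS}$. Assume $\alpha \models Q_D$. Then $\alpha' \models Q_S$.

Proof
We prove the conjecture by contradiction. Thus,

ASSUME: $\alpha' \not\models Q_S$
PROVE:  False

$\langle 1 \rangle 1$. $\alpha' \models \neg WF(C_{S,1}, rec_s = false \wedge rec_r = false) \vee$
             $\neg WF(C_{S,2}, rec_s = false \wedge rec_r = false) \vee$
             $\neg WF(C_{S,3}) \vee$
             $\neg WF(C_{S,4})$
        PROOF: Immediate by the Assumption, definition of $Q_S$, and the Boolean operators.
$\langle 1 \rangle 2$. CASE: $\alpha' \models \neg WF(C_{S,1}, rec_s = false \wedge rec_r = false)$
  $\langle 2 \rangle 1$. $\alpha' \models \Diamond\Box\neg\langle C_{S,1} \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false \wedge status \in Bool)$
          PROOF: From Case Hypothesis $\langle 1 \rangle$ by expanding WF and noting the fact that
          $enabled_{A_S}(C_{S,1}) = (status \in Bool)$.
  $\langle 2 \rangle 2$. $\alpha' \models \Diamond\Box\neg\langle C_{S,1} \rangle \wedge \Diamond\Box\neg\langle SM \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false \wedge status \in Bool)$
          PROOF: By $\langle 2 \rangle 1$, Lemma 7.13, and MP1.
  $\langle 2 \rangle 3$. $\alpha \models \Diamond\Box\neg\langle C_{S,1} \rangle \wedge \Diamond\Box\neg\langle SM \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false)$
          PROOF: By Lemmas 5.10 and 5.11 since $C_{S,1}$ consists of external actions and Definition
          7.5 of $B_{DS}$ implies that for all $(s, u) \in B_{DS}$, if $u \models (rec_s = false \wedge rec_r = false)$
          then $s \models (rec_s = false \wedge rec_r = false)$.
  $\langle 2 \rangle 4$. $\alpha \models \Diamond\Box\neg\langle C_{S,1} \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false \wedge queue = \varepsilon)$
          PROOF: By $\langle 2 \rangle 3$, Lemma 7.16, and MP1.
  $\langle 2 \rangle 5$. $\alpha \models \Diamond\Box\neg\langle C_{S,1} \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false \wedge status \in Bool)$
          PROOF: By $\langle 2 \rangle 4$ and Invariant 7.1 Part 1.
  $\langle 2 \rangle 6$. $\alpha \models \neg WF(C_{D,1}, rec_s = false \wedge rec_r = false)$
          PROOF: By $\langle 2 \rangle 5$, the definition of WF, the fact that $C_{S,1} = C_{D,1}$, and the fact that
          $enabled_{A_D}(C_{D,1}) = (status \in Bool)$.
  $\langle 2 \rangle 7$. Q.E.D.
          PROOF: By $\langle 2 \rangle 6$, the assumption that $\alpha \models Q_D$, and the definition of $Q_D$.
$\langle 1 \rangle 3$. CASE: $\alpha' \models \neg WF(C_{S,2}, rec_s = false \wedge rec_r = false)$
  $\langle 2 \rangle 1$. $\alpha' \models \Diamond\Box\neg\langle C_{S,2} \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false \wedge queue \neq \varepsilon)$
          PROOF: By expanding WF and noting that $enabled_{A_S}(C_{S,2}) = (queue \neq \varepsilon)$.
  $\langle 2 \rangle 2$. $\alpha \models \Diamond\Box\neg\langle C_{S,2} \rangle \wedge \Diamond\Box(rec_s = false \wedge rec_r = false \wedge queue \neq \varepsilon)$
          PROOF: By Lemmas 5.10 and 5.11 since $C_{S,2}$ consists of external actions and Definition
          7.5 of $B_{DS}$ and Lemma 7.3 imply that for all $(s, u) \in B_{DS}$, if $u \models (rec_s = false \wedge
          rec_r = false \wedge queue \neq \varepsilon)$ then $s \models (rec_s = false \wedge rec_r = false \wedge queue \neq \varepsilon)$.
  $\langle 2 \rangle 3$. $\alpha \models \neg WF(C_{D,2}, rec_s = rec_r = false)$
          PROOF: By $\langle 2 \rangle 2$, the definition of WF, the fact that $C_{S,2} = C_{D,2}$, and the fact that
          $enabled_{A_D}(C_{D,2}) = (queue \neq \varepsilon)$.
  $\langle 2 \rangle 4$. Q.E.D.
          PROOF: By $\langle 2 \rangle 3$, the assumption that $\alpha \models Q_D$, and the definition of $Q_D$.
$\langle 1 \rangle 4$. CASE: $\alpha' \models \neg WF(C_{S,3})$
  $\langle 2 \rangle 1$. Q.E.D.
          PROOF: Similar to Case $\langle 1 \rangle 3$.
$\langle 1 \rangle 5$. CASE: $\alpha' \models \neg WF(C_{S,4})$
  $\langle 2 \rangle 1$. Q.E.D.
          PROOF: Similar to Case $\langle 1 \rangle 3$.
$\langle 1 \rangle 6$. Q.E.D.
        PROOF: By $\langle 1 \rangle 1$ and the exhaustive cases $\langle 1 \rangle 2$-$\langle 1 \rangle 5$.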


Finally, we can prove that D correctly implements S.


Theorem 7.18
$D \sqsubseteq_L S$


Proof
Immediate from Lemmas 7.11, 7.17, and 5.9.


The total proof of correctness of D has been partitioned into three parts. First, some invariants
were proved. Then, a relation was defined and proved to be an image-finite backward simulation
from $A_D$ to $A_S$. Note that it is usually during the simulation proof that one realizes which
invariants are needed. Thus, when performing the proof there is usually no such clear distinction
between defining invariants and proving the simulation result, but for presentation purposes we
make the split.

The third and final part of the proof is the liveness proof which, in conjunction with the 
simulation proof, allows us to conclude correctness. In the proofs at lower levels of abstraction, 
the same partition into three parts is found. 


The Generic Protocol G is defined and proved correct in the next chapter. 


Chapter 8 


The Generic Protocol G 


We can now start to introduce a more distributed view of the system. Both low-level protocols 
H and C consist of several parallel components: a sender, a receiver, two channels connecting 
the sender and receiver, and, for C, a clock subsystem. The G level consists of three parallel 
processes: a sender/receiver process and two channels. This is depicted in Figure 8.1. The 
sender/receiver process of G can intuitively be viewed as “partly” distributed. It contains state 
variables which are intuitively manipulated by a sender part of the sender/receiver process 
and state variables which are intuitively manipulated by a receiver part. However, some state 
variables are manipulated by both the sender part and the receiver part of the sender/receiver 
process. These “centralized” variables describe aspects which will be implemented differently 
by H (using handshakes) and C (using timing assumptions). The “distributed” variables, on the 
other hand, will basically reoccur in both H and C, and will be manipulated similarly in G, H, 
and C. 

Thus, we have developed G to be as distributed as possible according to H and C, and to 
contain an abstract handling of the crucial aspects of choosing good identifiers, where H and C 
use different methods. By looking a little bit forward at H and C, we can make the following 
more detailed introduction to G: 


As mentioned in Chapter 1, solutions to the at-most-once message delivery problem work by 
tagging each message with a unique identifier and sending it repeatedly over the channel. The 
receiver will only accept messages which are marked with “good” identifiers. 

Thus, the two protocols H and C both go through three major phases during normal opera- 
tion. 


[Figure 8.1: The Generic Protocol G. The figure shows the sender/receiver process $G_{s/r}$ with its
“Sender” and “Receiver” parts and actions such as receive_msg(m), ack(b), $recover_s$ and $recover_r$,
connected to the two channels through send_pkt and receive_pkt actions.]

Choosing a message identifier. The sender picks an identifier id that is within the set of
identifiers that the receiver is willing to accept. In C time bounds are used to choose a
good identifier; in H an initial handshake between the sender and the receiver is used.


Sending the message and getting acknowledgement. This phase is similar in both H and
C. The sender (re)transmits the current message with the chosen id, until it receives an
acknowledgement packet for that id.


Cleaning up. Here again, C uses time bounds (in particular timeouts) whereas H uses a
handshake to determine when some “old” information may be discarded.


Our Generic Protocol G is designed to capture these three phases in an abstract way that both H
and C implement. The key abstractions incorporated into the protocol G are two “centralized”
variables, $good_s$ and $good_r$. The variable $good_s$ represents the identifiers that the sender might
shortly assign to messages, and $good_r$ represents the identifiers that the receiver is willing to
accept. Four actions of G deal with “growing” and “shrinking” $good_s$ and $good_r$, respectively.

The preconditions of the grow and shrink actions are designed to preserve certain key invariants.
We allow more freedom in these actions than is actually needed by H and C. This
leaves open the possibility that low-level protocols other than H and C can be proved to
be correct implementations of G.


The rest of this chapter is organized as follows. Section 8.1 introduces the set of message 
identifiers. Section 8.2 then formally defines the channels in G. Then, in Section 8.3, we present 
the sender/receiver process, and in Section 8.4 we show how G is obtained from the subprocesses. 
Finally, in Section 8.5 we consider the proof that G correctly implements D. 


8.1 Message Identifiers 


In G and the lower level protocols we need a set of identifiers in order to label the messages
communicated over the channels. In C the identifiers are timestamps ranging over the nonnegative
reals; in H the identifiers are just taken from some infinite set of elements. In G we
use a set ID on which we place some constraints. When proving correct implementation for
a lower-level protocol, ID is then instantiated with the set used at that lower level, and this
set must satisfy the constraints on ID. Thus, G can be seen to be parameterized with ID. G
correctly implements S for any proper value of ID; the low-level protocols correctly implement
G for particular proper values of ID. The constraints on ID are:


1. ID is infinite. 


2. $nil \notin ID$. We need nil as a special value.


8.2 The Channels 


As depicted in Figure 8.1, the G level contains two channels: a channel $Ch_{sr}$, intuitively for
sending packets¹ from the sender part to the receiver part of the sender/receiver process, and a
channel $Ch_{rs}$ in the other direction (for acknowledgements).


¹ Here and elsewhere, we use the term “packet” to denote objects sent over the channels; we reserve the term
“message” for the “higher-level”, user-meaningful messages that appear, e.g., in the specification.

Below we specify the $Ch_{sr}$ channel as a live I/O automaton $(A_{Ch,sr}, L_{Ch,sr})$. The $Ch_{rs} =
(A_{Ch,rs}, L_{Ch,rs})$ channel is similar and can be obtained from the definition of $Ch_{sr}$ by replacing
the state variable $sr$ by $rs$ and the actions $send\_pkt_{sr}(p)$ and $receive\_pkt_{sr}(p)$ by $send\_pkt_{rs}(p)$ and
$receive\_pkt_{rs}(p)$.


8.2.1 States and Start States 


$Ch_{sr}$ has only one state variable, which contains the packets (including duplicates) currently in
the channel. We let $Ch_{sr}$ be parameterized with a set $P$ of possible packets.


Variable   Type               Initially   Description
sr         $\mathcal{B}(P)$               The packets (including duplicates) in the channel.


8.2.2 Actions 


The channel only has two types of actions: $send\_pkt_{sr}(p)$, which represents the input of packet
$p$ from the environment, and $receive\_pkt_{sr}(p)$, which represents the output of packet $p$ from the
channel.


Input:
    $send\_pkt_{sr}(p)$, $p \in P$
Output:
    $receive\_pkt_{sr}(p)$, $p \in P$
Internal:
    none


8.2.3 Steps 


The channel is not reliable. This means that it may remove or duplicate packets. We have
chosen to model this unreliability at the time of a $send\_pkt_{sr}(p)$ step.


$send\_pkt_{sr}(p)$
    Effect:
        add a finite number of $p$ to $sr$

$receive\_pkt_{sr}(p)$
    Precondition:
        $p \in sr$
    Effect:
        $sr := sr \setminus \{p\}$   (* remove one copy *)


In the specification, “a finite number” could mean 0. Note that we could have modeled the
unreliability of the channel by having internal lose and duplicate actions which could remove
or duplicate packets at any time. However, such a channel can be shown to be equivalent to
our channel, so by our substitutivity results, we will be able to substitute the channels for each
other.
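
The two step rules can be animated with a small sketch. It is not part of the formal development: the class name, the use of `collections.Counter` as the bag, the random choice of the number of copies, and the bound `max_copies` are all assumptions made for illustration (the specification only requires the number of copies to be finite).

```python
import random
from collections import Counter


class LossyChannel:
    """Multiset-valued channel state `sr` with send_pkt / receive_pkt steps."""

    def __init__(self, rng=None):
        self.sr = Counter()                 # bag of packets currently in the channel
        self.rng = rng or random.Random()

    def send_pkt(self, p, max_copies=3):
        """Input step: add a finite number of copies of p (possibly zero)."""
        copies = self.rng.randint(0, max_copies)   # max_copies is an arbitrary bound
        if copies:
            self.sr[p] += copies
        return copies

    def can_receive(self, p):
        """Precondition of receive_pkt: p is in sr."""
        return self.sr[p] > 0

    def receive_pkt(self, p):
        """Output step: remove one copy of p from sr."""
        if not self.can_receive(p):
            raise ValueError("receive_pkt is not enabled for this packet")
        self.sr[p] -= 1
        if self.sr[p] == 0:
            del self.sr[p]
```

A send that adds zero copies models loss and a send that adds several copies models duplication, so no separate internal lose or duplicate actions are needed in this sketch.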


8.2.4 Liveness 


The $send\_pkt_{sr}(p)$ steps of $A_{Ch,sr}$ allow all packets sent to the channel to be lost. With such a channel we
cannot, of course, guarantee any liveness of the composed system, so we shall require that if we
keep sending the same packet to the channel, then infinitely many will get through. Thus, if a
packet is sent infinitely often, then it is also received infinitely often. Furthermore we impose the
natural requirement that if a packet has succeeded in being put into the channel, then eventually
it will be delivered.

Then the liveness condition $L_{Ch,sr}$ for the channel is induced by the following liveness formula:


$$Q_{Ch,sr} \triangleq \forall p : (\Box\Diamond\langle send\_pkt_{sr}(p) \rangle \Rightarrow \Box\Diamond\langle receive\_pkt_{sr}(p) \rangle) \;\wedge\; \forall p : WF(receive\_pkt_{sr}(p))$$


We do not prove formally that Q_Ch,sr is an environment-free liveness formula for A_Ch,sr. However,
we provide some intuition by informally describing an environment-free strategy (g, f) for Ch_sr
(cf. Definitions 2.5 and 2.7): on every input send_pkt_sr(p), the g function of the strategy should
add one copy of p to sr. This means that when we are playing the game against the environment,
whenever a send_pkt_sr(p) input arrives, receive_pkt_sr(p) will stay enabled at least until it is
executed.

The f function of the strategy, i.e., the function that determines the moves of the channel,
should then work as follows: when the game commences after some finite execution, there
are only finitely many packets in sr. The strategy can order these and use its first moves on
outputting the packets. In the meantime send_pkt_sr(p) actions occur. When the strategy has
finished outputting the initial packets, it should start matching each send_pkt_sr(p) action with a
receive_pkt_sr(p) action. Since f has access to the history of the game so far, at its first move
after having output the initial packets it should simply perform receive_pkt_sr(p_1) if the first
input action of the game was send_pkt_sr(p_1), and generally at its nth such move perform
receive_pkt_sr(p_n) if the nth input action of the game was send_pkt_sr(p_n). Even though the
environment may provide several (but only a finite number of) input actions at each move and,
thus, might be “faster” than the channel, at any point in time the channel only has finitely many
“unmatched” inputs, which it will eventually have matched. The point is that the environment can
never have sent infinitely many copies of the same packet without the channel having output
infinitely many copies of the same packet, and all packets put into the channel will eventually
be output. If f has matched all inputs, it should simply return the empty move ⊥, since in this
case the channel is empty.
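
The behaviour of f described above amounts to a first-in, first-out matching of inputs. The following Python sketch illustrates that reading only; the class name MatchingStrategy is hypothetical, None stands in for the empty move, and g is assumed to add exactly one copy per input.

```python
from collections import deque


class MatchingStrategy:
    """Sketch of f: output the initial packets, then match inputs in arrival order."""

    def __init__(self, initial_packets):
        # Packets already in sr when the game commences, in some fixed order.
        self.pending = deque(initial_packets)

    def on_input(self, p):
        """Record a send_pkt_sr(p) input (g is assumed to add exactly one copy of p)."""
        self.pending.append(p)

    def move(self):
        """Return the packet of the next receive_pkt_sr step, or None for the empty move."""
        if self.pending:
            return self.pending.popleft()
        return None
```

Because every input is appended once and every move removes one entry, only finitely many inputs are unmatched at any time, which is the property the informal argument relies on.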


Note that, by Proposition 3.4, Q_Ch,sr is stuttering-insensitive.


8.3 The Sender/Receiver Process


We specify the sender/receiver process as a live I/O automaton G_s/r = (A_G,s/r, L_G,s/r).


8.3.1 States and Start States 


As mentioned in the introduction to this chapter, A_G,s/r intuitively consists of a sender part
and a receiver part such that some state variables are only manipulated by the sender part,
some state variables are only manipulated by the receiver part, and some state variables are
manipulated by both parts. The state variables of A_G,s/r are grouped accordingly into
the following three classes. (When we write “sender” below, we refer to the sender part of the
sender/receiver process. Similarly for “receiver”.)


Variables manipulated only by the sender part:

mode_s
    Type: {idle, needid, send, rec}
    The mode of the sender. Mode idle indicates that the sender is not in the process of
    sending a packet over the channel, needid indicates that the sender is ready to choose
    an identifier for the current message, and send indicates that the sender is sending
    (repeatedly) the current packet (consisting of the current message with its identifier)
    over the channel. Mode rec denotes that the sender is in recovery phase.

used_s
    A list containing all identifiers assigned to messages in the past. These identifiers
    will never be used again. The list induces a partial order on identifiers (see below).

current-msg_s
    Type: Msg ∪ {nil}
    Initially: nil
    When mode_s ∈ {needid, send}, this variable contains the “current” message, i.e.,
    the message about to be or being sent. In the other modes current-msg_s is not used
    and is set to nil.

last_s
    Type: ID ∪ {nil}
    Initially: any value
    When mode_s = send this variable contains the identifier chosen for the current
    message. In all other modes its value is not used. Due to requirements in low-level
    protocols (where last_s could, e.g., be a timestamp), last_s is allowed to assume
    arbitrary values when it is not used.

current-ack_s
    Acknowledgement from the receiver.


Variables manipulated only by the receiver part:

mode_r
    Type: {idle, rcvd, ack, rec}
    The mode of the receiver. Mode idle indicates that the receiver has delivered all
    received messages to the user, rcvd indicates that messages have been accepted but not
    yet delivered to the user, and ack indicates that the receiver is sending positive
    acknowledgements for the last message accepted to the sender. Mode rec denotes that the
    receiver is in recovery phase.

buf_r
    The list of messages accepted by the receiver but not yet delivered.

last_r
    Type: ID ∪ {nil}
    Contains the identifier of the last message accepted. When its value is not used, it is
    assigned the special value nil.

issued_r
    Type: P(ID)
    Initially: any superset of good_r such that |ID \ issued_r| = ∞
    Includes everything that was ever acceptable by the receiver, i.e., in good_r. Thus,
    issued_r is used to guarantee that “old” identifiers do not show up in good_r again,
    which could otherwise lead to duplicate delivery.

nack-buf_r
    Type: ID*
    Initially: ε
    A list of identifiers for which a negative acknowledgement will be issued.


Variables manipulated by both parts:

good_s
    Type: P(ID)
    Initially: any set
    When mode_s = needid this set contains all the identifiers that the sender might choose
    for the current message. In all other modes its value is not used.

good_r
    Type: P(ID)
    Initially: any set
    At any time this set contains the identifiers the receiver will accept from the channel.

current-ok
    Type: Bool
    Initially: false
    If current-ok = true the identifier chosen for the current message is considered good
    by the receiver, but the current message has not been accepted by the receiver yet.


8.3.2 Partial Order of Identifiers 


In the G protocol we need an ordering of all the identifiers used as ids on messages sent by
the sender. As we shall see below, an identifier id is chosen in a choose_id(id) step, so if a
choose_id(id) step has occurred before a choose_id(id′) step, we will require that id is less than
id′ in this ordering. Since, as we shall see, all the ids used by the sender are collected in
used_s, we use the following partial order derived from the state of G:


If used_s contains distinct elements and id precedes id′ in used_s, then id <_s id′


In arbitrary states of G the same identifier might occur several times in used_s; however, below
we shall prove an invariant (Invariant 8.2 Part 2 on Page 125), which states that the elements
of used_s are all distinct, which then implies that all identifiers ever used by the sender during
execution are related by <_s. Since identifiers of ID can be tested for equality (=), the
definition of <_s trivially extends to ≤_s.
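
As a small illustration, the order can be computed directly from the list of used identifiers. This is a sketch only: used_s is represented as a Python list, and the function names strictly_before and before_or_equal are hypothetical.

```python
def strictly_before(used_s, a, b):
    """a <_s b: a precedes b in used_s (requires the elements of used_s to be distinct)."""
    if len(set(used_s)) != len(used_s):
        raise ValueError("used_s must contain distinct elements")
    if a not in used_s or b not in used_s:
        return False  # identifiers that were never chosen are unrelated
    return used_s.index(a) < used_s.index(b)


def before_or_equal(used_s, a, b):
    """a <=_s b: the extension obtained by also allowing a = b."""
    return a == b or strictly_before(used_s, a, b)
```

For example, with used_s = [3, 7, 5], identifier 3 is strictly before 5, while 5 is not before 7, because 7 was chosen earlier.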


8.3.3 Actions 


Input:
    send_msg(m), m ∈ Msg
    receive_pkt_sr(m, id), m ∈ Msg, id ∈ ID
    receive_pkt_rs(id, b), id ∈ ID, b ∈ Bool
    crash_s
    crash_r

Output:
    receive_msg(m), m ∈ Msg
    ack(b), b ∈ Bool
    send_pkt_sr(m, id), m ∈ Msg, id ∈ ID
    send_pkt_rs(id, b), id ∈ ID, b ∈ Bool
    recover_s
    recover_r

Internal:
    prepare
    choose_id(id), id ∈ ID
    shrink_good_s(ids), ids ⊆ ID
    shrink_good_r(ids), ids ⊆ ID
    grow_good_s(ids), ids ⊆ ID
    grow_good_r(ids), ids ⊆ ID
    cleanup_r


8.3.4 Steps 


Before we formally define steps(A_G,s/r) we provide some intuition. During normal operation the
sender goes through the cycle idle-needid-send-idle of modes. When the sender is in mode
idle and buf_s is non-empty, a prepare step moves to mode needid and makes the message at
the head of buf_s the current message. Now “good” identifiers must be put into good_s. Exactly
how this is done will be discussed below. An identifier id for the current message is chosen from
good_s in a choose_id(id) step. In such a step the sender enters send mode, in which it repeatedly
sends the current message m with associated current identifier id in send_pkt_sr(m, id) steps.
The sender will stay in this mode until it receives a positive (b = true) or negative (b = false)
acknowledgement receive_pkt_rs(id, b) for the current identifier. In this case the sender moves to
mode idle again, from where acknowledgements ack(b) can be issued to the user (but only if
buf_s is empty, since otherwise the sender is not acknowledging the last message sent, as required).
If the receiver receives a packet (m, id) in a receive_pkt_sr(m, id) step, it checks to see whether
id is in good_r. If this is the case it accepts² the message m, adds it to the end of buf_r, and enters
mode rcvd (if it was not there already). Mode rcvd indicates that the receiver has messages in
buf_r and is in the process of delivering these messages to the user. Once the last message in buf_r
has been delivered in a receive_msg(m) step, the receiver enters ack mode, in which it will issue
positive acknowledgements in send_pkt_rs(id, true) steps for the identifier id of the last message
accepted from the sender (and thus the last message delivered to the user). These positive
acknowledgements will be issued repeatedly to overcome the unreliability of the channel.


² We say that a packet (or the associated message) is “successfully received” or “accepted” when the
associated identifier is in good_r at the time of receipt.

The above discussion has focused on the normal modes of operation of the sender and receiver,
where no crashes have occurred. After the formal definition of steps(A_G,s/r), we explain what
can happen when sender or receiver crashes occur.


We now look at the manipulation of the good sets. When a prepare step is performed, the good_s
set is emptied. The sender is now in needid mode, waiting to perform a choose_id(id) step.
Since id must be taken from good_s, this set must be “grown” with identifiers. Two types of
steps can change good_s: shrink_good_s(ids) removes identifiers from good_s and grow_good_s(ids)
adds identifiers to good_s. When the receiver has not been in recovery phase “recently”, i.e.,
after the prepare step was performed, the sender and receiver should be in agreement about
which identifiers are considered good. This situation is indicated by the special flag current-ok
being true. In this situation grow_good_s(ids) can only add elements from good_r to good_s, and
the shrink_good_r(ids) steps, which can remove elements from good_r, must not remove elements
which are already in good_s. In this way we preserve the key invariant that if current-ok = true,
then good_s ⊆ good_r, and, thus, the current packet is guaranteed to be accepted by the receiver
(unless new crashes occur). A detail is that identifiers put into good_s might immediately be
“shrunk” away by a shrink_good_s(ids) step that empties good_s. (If we look forward at C, only
the value of the local sender clock is considered a good identifier. Thus, whenever the clock
ticks, this corresponds, in G, to the old clock value being removed from good_s, and the new
value being added to good_s.) When we deal with liveness below, we show how to guarantee that
the sender will not grow and shrink good_s forever but will eventually choose an identifier in a
choose_id(id) step.

If crashes occur, the low-level implementations H and C have no way of keeping good_s a
subset of good_r. This must at the G level be reflected in the grow and shrink steps. We have
designed these steps such that they preserve certain key invariants presented below. The steps
actually allow more freedom than is needed by the implementations H and C, but in this way
we leave open the possibility that other low-level implementations implement G. If, for instance,
current-ok = false, it turns out to be necessary to allow shrink_good_r to remove elements from
good_r which are already in good_s. If, furthermore, mode_r = rec, good_s can be grown fairly
arbitrarily. It is in this situation possible to add elements to good_s which have never been issued
by the receiver. This may give rise to a situation where the current identifier is not in good_r
when the current packet is sent, but is added to good_r during transmission over the channel.
(For this reason we shall, in the proofs below, introduce a derived variable good-ids containing
identifiers from good_r and identifiers not issued yet. Packets with identifiers in good-ids have a
chance of being accepted by the receiver.)

Other preconditions on the grow and shrink steps deal with guaranteeing that the sender
and receiver do not reuse identifiers in their good sets. In particular, the set issued_r, which
“survives” a crash (and thus has to be implemented in stable storage in the implementations),
contains all identifiers that were ever in good_r. No identifier in issued_r can ever be put in good_r.
In this way it is guaranteed that the receiver will never, not even in the case of crashes, accept
the same packet twice. Similarly, the sender will never choose an identifier which is in used_s.
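
The constraints on growing good_s and shrinking good_r can be written as predicates over the state. The following Python sketch follows the guards of the formal step definitions in this section as reconstructed here; the dictionary-based state, its key names, and the function names are assumptions made for illustration.

```python
def grow_good_s_enabled(st, ids):
    """Guard of grow_good_s(ids): unrestricted unless the sender is in needid mode."""
    if st["mode_s"] != "needid":
        return True  # good_s is not used outside needid mode
    return ((st["mode_r"] == "rec" or ids <= st["issued_r"])
            and (not st["current_ok"] or ids <= st["good_r"])
            and ids.isdisjoint(st["used_s"]))


def shrink_good_r_enabled(st, ids):
    """Guard of shrink_good_r(ids): unrestricted when current-ok is false."""
    if not st["current_ok"]:
        return True
    return ((st["mode_s"] != "needid" or ids.isdisjoint(st["good_s"]))
            and (st["mode_s"] != "send" or st["last_s"] not in ids))
```

With current-ok true and the sender in needid mode, the first guard only admits identifiers that are in good_r and have not been used, and the second guard refuses to remove anything that is already in good_s, which is how good_s ⊆ good_r is maintained.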


We now define steps(A_G,s/r). The definition is given in two parts. Each part lists the steps of
the sender first, followed by the steps of the receiver, so that the send_pkt steps appear
together with the corresponding receive_pkt steps.

send_msg(m)
    Effect:
        if mode_s ≠ rec then
            buf_s := buf_s ^ m


prepare
    Precondition:
        mode_s = idle ∧ buf_s ≠ ε
    Effect:
        mode_s := needid
        good_s := ∅
        current-msg_s := head(buf_s)
        buf_s := tail(buf_s)
        if mode_r ≠ rec then
            current-ok := true


choose_id(id)
    Precondition:
        mode_s = needid ∧ id ∈ good_s
    Effect:
        mode_s := send
        last_s := id
        used_s := used_s ^ id


send_pkt_sr(m, id)
    Precondition:
        mode_s = send ∧ last_s = id ∧
        current-msg_s = m
    Effect:
        none

receive_pkt_sr(m, id)
    Effect:
        if mode_r ≠ rec then
            if id ∈ good_r then
                mode_r := rcvd
                buf_r := buf_r ^ m
                last_r := id
                good_r := good_r \ {id′ | id′ ≤_s id}
                if id = last_s ∧ mode_s = send then
                    current-ok := false
            else if id ≠ last_r then
                if mode_s = send ∧ id = last_s then
                    nack-buf_r := nack-buf_r ^ id
                else
                    optionally nack-buf_r := nack-buf_r ^ id
            else if mode_r = idle then
                mode_r := ack


receive_msg(m)
    Precondition:
        mode_r = rcvd ∧ buf_r ≠ ε ∧ head(buf_r) = m
    Effect:
        buf_r := tail(buf_r)
        if buf_r = ε then
            mode_r := ack


receive_pkt_rs(id, b)
    Effect:
        if mode_s = send ∧ last_s = id then
            mode_s := idle
            current-ack_s := b
            last_s := arbitrary value
            current-msg_s := nil


ack(b)
    Precondition:
        mode_s = idle ∧ buf_s = ε ∧
        current-ack_s = b
    Effect:
        none


crash_s
    Effect:
        mode_s := rec
        current-ok := false


recover_s
    Precondition:
        mode_s = rec
    Effect:
        mode_s := idle
        last_s := arbitrary value
        buf_s := ε
        current-msg_s := nil
        current-ack_s := false


grow_good_s(ids)
    Precondition:
        mode_s ≠ needid ∨
        ((mode_r ≠ rec ⇒ ids ⊆ issued_r) ∧
         (current-ok = true ⇒ ids ⊆ good_r) ∧
         (ids ∩ used_s = ∅))
    Effect:
        good_s := good_s ∪ ids


shrink_good_s(ids)
    Precondition:
        none
    Effect:
        good_s := good_s \ ids

send_pkt_rs(id, true)
    Precondition:
        mode_r = ack ∧ last_r = id
    Effect:
        optionally mode_r := idle


send_pkt_rs(id, false)
    Precondition:
        mode_r ≠ rec ∧ nack-buf_r ≠ ε ∧
        head(nack-buf_r) = id
    Effect:
        nack-buf_r := tail(nack-buf_r)


crash_r
    Effect:
        mode_r := rec
        current-ok := false


recover_r
    Precondition:
        mode_r = rec
    Effect:
        mode_r := idle
        last_r := nil
        buf_r := ε
        nack-buf_r := ε
        issued_r := any superset of
                    issued_r ∪ used_s ∪ good_s
                    such that afterwards
                    |ID \ issued_r| = ∞


grow_good_r(ids)
    Precondition:
        ids ∩ issued_r = ∅ ∧
        |ID \ (ids ∪ issued_r)| = ∞
    Effect:
        good_r := good_r ∪ ids
        issued_r := issued_r ∪ ids


shrink_good_r(ids)
    Precondition:
        current-ok = false ∨
        ((mode_s = needid ⇒ ids ∩ good_s = ∅) ∧
         (mode_s = send ⇒ last_s ∉ ids))
    Effect:
        good_r := good_r \ ids


cleanup_r
    Precondition:
        mode_r ∈ {idle, ack} ∧
        (mode_s = send ⇒ last_s ≠ last_r)
    Effect:
        mode_r := idle
        last_r := nil


Note that most locally-controlled steps of the sender and receiver are conditioned by mode_s and
mode_r, respectively, not being rec. Also, inputs (except crash_s and crash_r) do not lead to state
changes when the side at which they occur is crashed. Thus, G is “dead” when it is crashed.
Furthermore, crashes and subsequent recoveries have the effect of resetting all state variables
(except issued_r and used_s) at the side at which they occur. For instance, even if the sender is
about to issue a positive acknowledgement to the user when a sender crash occurs, the sender
has forgotten about this when it recovers. These choices about the way G behaves with respect
to crashes are motivated by the low-level protocols H and C.


We now discuss certain special situations that can arise mainly due to crashes or recoveries.
Assume that the sender is in send mode with (m_1, id_1) as the current packet. If a crash_s occurs,
the sender forgets, among other things, everything about (m_1, id_1). However, before it crashed,
the sender might have succeeded in placing (m_1, id_1) in the channel. Since we do not assume
any time bounds on channel delays, (m_1, id_1) might travel very slowly on the channel. In the
meantime the sender recovers, receives a new message m_2 in a send_msg(m_2) step, assigns the
identifier id_2 to m_2, and starts sending (m_2, id_2) to the channel. Now both (m_1, id_1) and
(m_2, id_2) are traveling on the channel, and both id_1 and id_2 might be in good_r. (The receiver
has no way of knowing that the sender has crashed.) In general, if crashes have occurred, several
packets (m_1, id_1), ..., (m_k, id_k) with identifiers in good_r might be traveling on the channel.
This gives rise to a race condition between the packets. Assume (m_i, id_i) is the first packet that
reaches the receiver and gets accepted. Then the receiver is not allowed subsequently to accept any
of the packets (m_1, id_1), ..., (m_i, id_i), since then either the receiver would accept the same
message twice or it would reorder messages (since m_1, ..., m_{i-1} were sent before m_i). The
messages m_1, ..., m_{i-1} are thus effectively lost, but since they were in the system during
crashes, this is allowed by the Delayed-Decision Specification D (and consequently by the
specification S). This explains the manipulation of good_r in the definition of the
receive_pkt_sr(m, id) steps. If the sender crashes in needid mode, the same kind of race condition
does not arise since the current packet has not been placed in the channel yet. Messages do get
lost in this case but, again, this is allowed by D.

If the receiver receives a packet (m, id) and id is not in good_r, it will not accept the packet.
Now, two situations must be considered (which correspond to the two “else-if” cases in the
definition of receive_pkt_sr(m, id) above).


1. If id ≠ last_r, we are not just receiving another copy of the last packet accepted.

   • If mode_s = send and id = last_s, we are, due to crashes, in a situation where the
     sender is in send mode with a “bad” identifier. The receiver must inform the sender
     about this situation since otherwise the sender would be stuck forever. Thus, the
     receiver adds id to nack-buf_r, which will lead to a send_pkt_rs(id, false) step. Note
     that since only one send_pkt_rs(id, false) will be performed, there is no guarantee that
     the packet will actually be put into the channel (which is unreliable). However, the
     sender continues to send (m, id), so packets will continue to get through (due to
     channel liveness) to the receiver. Every time this happens, the receiver will add id to
     nack-buf_r, so (id, false) will continue to be issued. By channel liveness in the other
     direction the sender will eventually receive (id, false) and thereby be dislodged.

   • If mode_s ≠ send or id ≠ last_s, the received packet (m, id) is not the current packet
     of the sender but instead some old packet from the channel. The low-level protocols
     we consider cannot always identify this situation, mainly because the receiver in a
     distributed implementation does not have access to mode_s and last_s. The C protocol
     can in some situations make some safe guesses, but generally a low-level protocol has
     to assume the worst case and thus add id to nack-buf_r. The G protocol leaves this
     possibility open.

2. If id = last_r, we are receiving a new copy of the last packet accepted. In this situation
   mode_r could be idle, in which case it should be changed to ack. The situation is explained
   as follows.

   Due to requirements in the low-level implementations, a send_pkt_rs(id, true) step must
   have the possibility of changing mode_r to idle, which disables further send_pkt_rs(id, true)
   steps. Thus, due to the unreliability of the channels, we are not sure that (id, true)
   actually arrives to inform the sender that the current packet was successfully received.
   But the sender will then continue to send (m, id) packets, and the (inevitable) receipt of
   some of these by the receiver will lead to a mode change to ack, which, in turn, leads to
   send_pkt_rs(id, true) steps. As above, channel liveness ensures that a receive_pkt_rs(id, true)
   step will eventually occur as required.
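
The case analysis for an incoming packet, including the pruning of good_r that resolves the race between old packets, can be sketched in Python as follows. The dictionary-based state and the nack_old parameter (which resolves the “optionally” branch) are assumptions made for illustration; the sketch writes mode_s, last_r and so on for the sender and receiver variables.

```python
def receive_pkt_sr(st, m, id_, nack_old=True):
    """Apply the effect of receive_pkt_sr(m, id) to the state dict st in place.

    nack_old resolves the 'optionally' branch: whether an old packet is nacked.
    """
    if st["mode_r"] == "rec":
        return  # a crashed receiver ignores its inputs
    if id_ in st["good_r"]:
        st["mode_r"] = "rcvd"
        st["buf_r"].append(m)
        st["last_r"] = id_
        # Remove id_ and every identifier chosen before it (id' <=_s id_).
        used = st["used_s"]
        older = set(used[: used.index(id_) + 1]) if id_ in used else {id_}
        st["good_r"] -= older
        if id_ == st["last_s"] and st["mode_s"] == "send":
            st["current_ok"] = False
    elif id_ != st["last_r"]:
        is_current = st["mode_s"] == "send" and id_ == st["last_s"]
        if is_current or nack_old:
            st["nack_buf_r"].append(id_)
    elif st["mode_r"] == "idle":
        st["mode_r"] = "ack"
```

If packets with identifiers 1 and 2 are both in transit and the one with identifier 2 arrives first, both identifiers leave good_r, so the older packet can no longer be accepted and message order is preserved.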


Some of this discussion has dealt with liveness. We now turn to the formal definition of the
liveness condition for G_s/r.


8.3.5 Liveness 


Let


C_G,s/r1 ≜ {prepare, ack(true), ack(false), recover_s} ∪
           {send_pkt_sr(m, id) | m ∈ Msg ∧ id ∈ ID}

C_G,s/r2 ≜ {choose_id(id) | id ∈ ID}

C_G,s/r3 ≜ {recover_r} ∪
           {receive_msg(m) | m ∈ Msg} ∪
           {send_pkt_rs(id, true) | id ∈ ID}

C_G,s/r4 ≜ {send_pkt_rs(id, false) | id ∈ ID}


The liveness condition L_G,s/r for A_G,s/r is now induced by the following temporal formula.


Q_G,s/r ≜ WF(C_G,s/r1) ∧
          □(□(mode_s = needid ∧ mode_r ≠ rec) ⇒ ◇⟨C_G,s/r2⟩) ∧
          WF(C_G,s/r3) ∧
          WF(C_G,s/r4)


The first, third, and fourth conjuncts express ordinary weak fairness for certain
locally-controlled actions of the sender and receiver, respectively.


The second conjunct looks more complicated but simply states that it is always the case
that if the sender stays in mode needid and the receiver does not crash, then eventually a
choose_id(id) step occurs. Thus, infinite growing and shrinking of the good sets is avoided.
Note that this kind of liveness condition is more high-level than, e.g., weak fairness, but it
exactly captures the intuitive requirement on the execution of the system, and the general model
of live I/O automata allows such general liveness requirements.


As for the liveness formula for the channel Ch_sr above, we do not formally prove that Q_G,s/r is
an environment-free liveness formula for A_G,s/r, but instead provide some intuition as to how an
environment-free strategy (g, f) could be defined: on inputs, the g function can choose arbitrarily
between nondeterministic choices. The f function should deal with the four conjuncts of Q_G,s/r
in a round-robin fashion: if it dealt with the first conjunct last time, it should deal with the
second conjunct now, and so on. If it is time to deal with one of the weak-fairness formulas,
f simply performs some step from the appropriate set if possible. The second conjunct needs
more attention. Here f should do the following if mode_s = needid and mode_r ≠ rec, and do
nothing otherwise:


1. If good_s ≠ ∅, then perform a choose_id(id) step.


2. Else, if good_r ≠ ∅, perform a grow_good_s(ids) step (with ids nonempty). Such a step is
   always possible when good_r ≠ ∅.


3. Else, perform a grow_good_r(ids) step with ids nonempty. Such a step is always possible
   since there are always infinitely many unused identifiers left.


If Part 3 was performed, then Part 2 will be performed next time the second conjunct of Q_G,s/r
is dealt with. If Part 2 was performed, then Part 1 will be chosen next time. This is under
the assumption that the sender stays in mode needid and the receiver does not crash in the
meantime, but if this is not satisfied, then the second conjunct does not restrict the execution
at all.
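
The three-part case split for the second conjunct can be sketched as a decision function. This is an illustration only: the dictionary-based state, the function name needid_move, and the fresh_id callable (assumed to return an identifier outside issued_r) are hypothetical, and the sketch takes the text's claim that each chosen step is enabled at face value rather than rechecking the guards.

```python
def needid_move(st, fresh_id):
    """Return the step f takes for the second conjunct, or None to do nothing.

    fresh_id is a hypothetical callable returning an identifier outside issued_r.
    Identifiers are assumed to be comparable so that min() gives a fixed choice.
    """
    if st["mode_s"] != "needid" or st["mode_r"] == "rec":
        return None
    if st["good_s"]:
        return ("choose_id", min(st["good_s"]))           # Part 1
    if st["good_r"]:
        return ("grow_good_s", frozenset(st["good_r"]))   # Part 2
    return ("grow_good_r", frozenset({fresh_id()}))       # Part 3
```

Starting from empty good sets, three consecutive calls (with the returned steps applied in between) yield grow_good_r, then grow_good_s, then choose_id, matching the progression described above.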


Another thing to note is that, by Lemma 4.8 and Proposition 3.4, Q_G,s/r is stuttering-insensitive.


8.4 The Specification of G 


As depicted in Figure 8.1, G consists of the sender/receiver process and the two channels. So,
first define G′ = (A_G′, L_G′) to be the following live I/O automaton


G′ ≜ G_s/r ∥ Ch_sr ∥ Ch_rs


where the set P of possible packets of the channels is instantiated with the packets that G_s/r
can send and receive, i.e., packets of the form (m, id) and (id, b). Thus, G′ is the parallel
composition of the sender/receiver process and the channels. Since Q_G,s/r, Q_Ch,sr, and Q_Ch,rs
are all stuttering-insensitive, Proposition 4.4 implies that L_G′ is induced by


Q_G ≜ Q_G,s/r ∧ Q_Ch,sr ∧ Q_Ch,rs


By Definition 2.2 the channel actions send_pkt_sr(m, id), receive_pkt_sr(m, id), send_pkt_rs(id, b),
and receive_pkt_rs(id, b) are output actions of G′. Thus, to get G = (A_G, L_G) we hide these
actions. Let


𝒜_G ≜ {send_pkt_sr(m, id) | m ∈ Msg ∧ id ∈ ID} ∪
      {receive_pkt_sr(m, id) | m ∈ Msg ∧ id ∈ ID} ∪
      {send_pkt_rs(id, b) | id ∈ ID ∧ b ∈ Bool} ∪
      {receive_pkt_rs(id, b) | id ∈ ID ∧ b ∈ Bool}

Then, define

G ≜ G′ \ 𝒜_G

By Proposition 4.5, L_G is induced by Q_G.


We can now turn attention to proving that G correctly implements D. 


8.5 Correctness of G 


In this section we consider the proof that G = (A_G, L_G) correctly implements D = (A_D, L_D).
This will be done in terms of a refinement mapping from A_G to A_D and a subsequent liveness
proof. We perform the refinement proof in full detail, but only sketch the liveness proof. We
refer to the formal liveness proof at the H level for a similar, but formal, liveness proof.
First, we state some invariants of A_G.


8.5.1 Invariants 


As mentioned in Chapter 7, during the process of performing a simulation proof, it usually
becomes clear that certain invariants are needed: some situation in the proof is impossible to
solve but it turns out that the state in which the situation occurs is not reachable. Thus, an
invariant that avoids these “bad” states is found. In this section we present the invariants
we need in the refinement mapping proof from A_G to A_D. The proofs of the invariants are
deferred to Appendix C, where we furthermore consider the general way to prove invariants of
safe (timed) I/O automata.

In the invariants we use a derived variable good-ids defined as follows: in any state s of A_G,
define

s.good-ids = s.good_r ∪ (ID \ s.issued_r)

that is, s.good_r together with the complement of s.issued_r with respect to ID. A message
assigned an id in s.good-ids might still be received successfully, i.e., accepted by the receiver.
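
Since ID is infinite, good-ids cannot be stored as an explicit set, but membership is easy to test. A minimal Python sketch, assuming a dictionary-based state with hypothetical keys good_r and issued_r:

```python
def in_good_ids(st, id_):
    """id is in good-ids iff it is in good_r or has never been issued."""
    return id_ in st["good_r"] or id_ not in st["issued_r"]
```

An identifier that was issued and later removed from good_r is therefore the only kind that falls outside good-ids.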


The first invariant has two parts which state simple properties of the state when the sender is
in send mode. (Recall from Appendix A that last_s ∈ used_s is shorthand notation for last_s ∈
elems(used_s). Similar notation will be used below.)


Invariant 8.1


1. If mode_s = send then last_s ∈ used_s


2. If mode_s = send then last_s ≠ nil


When the sender is in needid mode, it can never choose among identifiers that have been used
before (since such identifiers cannot be put into good_s again). As a consequence used_s contains
distinct elements.


Invariant 8.2


1. If mode_s = needid then used_s ∩ good_s = ∅

2. All elements of used_s are distinct


As expected, a receiver mode of rcvd indicates that there are some messages in the receiver
buffer which have not yet been delivered to the user.


Invariant 8.3

1. If mode_r = rcvd then buf_r ≠ ε


The following invariant is a key invariant. It states relationships between and properties of the
different sets of identifiers in A_G.

In this invariant and other invariants below, we use the following definition: in any
state s of A_G, define ids(sr) to be the set of id components of the packets in the sr channel.
Formally, we have


ids(sr) = {id | m ∈ Msg ∧ (m, id) ∈ sr}

Similarly,

ids(rs) = {id | b ∈ Bool ∧ (id, b) ∈ rs}


Invariant 8.4

1. issued_r ⊇ good_s if mode_s = needid ∧ mode_r ≠ rec
2. issued_r ⊇ good_r
3. issued_r ⊇ used_s if mode_r ≠ rec
4. used_s ⊇ ids(sr) ∪ (if mode_s = send then {last_s} else ∅)
5. used_s ⊇ nack-buf_r
6. used_s ⊇ ids(rs)
7. last_r ∉ good-ids
8. If last_r ≠ nil then last_r ∈ used_s


The following invariant states that for any two packets in sr (possibly including the
current packet), if the packets have the same identifier, then the packets are equal (and thus
represent two copies of the same packet).


Invariant 8.5

1. Let pkts = sr ∪ (if mode_s = send then {(current-msg_s, last_s)} else ∅), and
   let (m, id) ∈ pkts and (m′, id′) ∈ pkts. Then
   if id = id′ then m = m′




The next invariant states properties of reachable states where current-ok = true. Recall that
current-ok intuitively is a flag which is true whenever the sender is in the process of sending the
next message (packet), the receiver has not been in recovery phase since the last prepare action,
and the current packet has not been received yet. Thus, current-ok = true indicates that the
sender and receiver are synchronized and in agreement about which identifiers to use.


Invariant 8.6


1. If current-ok = true then mode_s ∈ {needid, send}

2. If current-ok = true then mode_r ≠ rec

3. If current-ok = true ∧ mode_s = send then last_s ≠ last_r

4. If current-ok = true ∧ mode_s = send then (last_s, b) ∉ rs

5. If current-ok = true ∧ mode_s = needid then good_s ⊆ good_r

6. If current-ok = true ∧ mode_s = send then last_s ∈ good_r

7. If current-ok = true ∧ mode_s = send then last_s ∉ nack-buf_r


In certain situations current-ok is guaranteed to be false. This is the case, for instance, if
the sender is in send mode and the current packet has been accepted by the receiver (indicated
by either last_s = last_r or the fact that an acknowledgement for last_s is in rs).


Invariant 8.7


1. If mode_s = send ∧ last_s = last_r then current-ok = false


2. If mode_s = send ∧ (last_s, b) ∈ rs then current-ok = false


We now state properties of the identifiers in sr. Part 1 states that each identifier in sr has
been chosen before (or is equal to) the current identifier when mode_s = send. This is expressed
using the ordering ≤_s induced by used_s. Parts 2-4 state that if either (2) the current packet
has been accepted by the receiver, (3) the receiver has sent a positive acknowledgement for the
current packet to rs, or (4) the sender has received the positive acknowledgement, then none of
the identifiers in sr (possibly including the current identifier last_s) can ever become “good”,
i.e., can ever reappear in good_r. (These invariants among other things guarantee that A_G can
never reorder messages or accept the same packet twice.)


Invariant 8.8

1. If mode_s = send ∧ id ∈ ids(sr) then last_s ≥_s id

2. If mode_s = send ∧ last_s = last_r then ({last_s} ∪ ids(sr)) ∩ good-ids = ∅

3. If mode_s = send ∧ (last_s, true) ∈ rs then ({last_s} ∪ ids(sr)) ∩ good-ids = ∅

4. If mode_s = idle ∧ current-ack_s = true then ids(sr) ∩ good-ids = ∅


In certain situations buf_r is guaranteed to be empty. Part 1 of the following invariant states
that if mode_r = idle then buf_r is empty. This situation occurs if the receiver has just sent
an acknowledgement after having delivered the last message to the user, or if the receiver has
just recovered. Parts 2-4 deal with the situation where the current message is being acknowledged
over rs. Either (2) the receiver is sending positive acknowledgements for the last message
received (and passed on to the user), (3) the receiver has succeeded in placing the positive
acknowledgement in rs, or (4) the sender has already received the positive acknowledgement.


Invariant 8.9


1. If mode_r = idle then buf_r = ε

2. If mode_r = ack then buf_r = ε

3. If mode_s = send ∧ (last_s, true) ∈ rs then buf_r = ε

4. If mode_s = idle ∧ current-ack_s = true then buf_r = ε


The following invariant states that identifiers for which the receiver will send or has sent
negative acknowledgements can never (again) be considered “good” by the receiver.


Invariant 8.10


1. nack-buf_r ∩ good-ids = ∅


2. ids(rs) ∩ good-ids = ∅


Furthermore, the receiver can never issue negative acknowledgements for the current identifier 
if it has accepted the current packet (unless new crashes have occurred). 


Invariant 8.11
1. If mode_s = send ∧ last_s ∈ nack-buf_r then last_s ≠ last_r.
2. If mode_s = send ∧ (last_s, false) ∈ rs then last_s ≠ last_r.
| 


Our final invariant states that there are always “enough” (read: infinitely many) identifiers 
left that have not been issued. This is an important invariant since it ensures that a message 
to be sent can always be associated with an identifier. The invariant will not be used in the 
safety proof since not being able to choose an identifier does not violate any safety requirement. 
Instead the invariant is essential for the system to guarantee any liveness requirements. 


Invariant 8.12
1. |ID \ issued_r| = ∞
|| 


The conjunction of all invariants above (which is itself an invariant) will be referred to by I_G.


8.5.2 Safety 


In this section we show the existence of a refinement mapping from A_G to A_D. However, first
we need some preliminary definitions.

Let s be any state of A_G which satisfies I_G. Define the possible pairs in s in the following
way:


s.pos-pairs = {(m, id) ∈ s.sr | id ∈ s.good-ids ∧ (s.mode_s = send ⟹ id ≠ s.last_s)}


The pairs in s.pos-pairs represent the “old” packets in sr that still have a chance of being
successfully received by the receiver. Note that we do not count (s.current-msg_s, s.last_s) as a
possible pair when s.mode_s = send. Thus, the set of possible pairs in a state consists of packets
for which the sender never stayed around to receive acknowledgement because of sender crashes.
If no crashes have ever occurred the set is empty.

We want to order the possible pairs of a state into a list reflecting the order in which the
pairs were sent. For this reason we define, for any state s of A_G which satisfies I_G, a total
order on the packets in s.sr based on the partial order on ids imposed by s.used_s (see
Section 8.3.2):


(m', id') ≤ (m'', id'') if id' ≤ id''


Invariant 8.4 Part 4 and Invariant 8.5 Part 1 imply that the order is indeed total on all packets
in s.sr for reachable states s of A_G.

Now, for any state s of A_G which satisfies I_G, define the possible list, written s.pos-list,
to be the list obtained by ordering the elements of s.pos-pairs according to the ordering just
introduced. (The closer to the head of the list, the smaller the value according to the ordering.)
Thus, s.pos-list is the list of those packets (excluding the current packet) that still might be
successfully received, and is ordered according to the order in which the packets were sent, with
older packets occurring towards the head of the list. For all states s of A_G not satisfying I_G,
define s.pos-list to be ε.

Define the function messages to extract the list of messages from a list of packets of sr.
Thus, if l = ((m_1, id_1), ..., (m_n, id_n)) then messages(l) = (m_1, ..., m_n).
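
The three definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the state is a hypothetical dictionary, used_s is modelled as a list of identifiers in the order in which they were chosen (so that list position gives the ordering), sr is modelled as a set of (message, id) pairs, and every identifier in sr is assumed to occur in used_s. None of these encodings are taken from the definition of A_G itself.

```python
def pos_pairs(s):
    """Packets in sr that may still be accepted, excluding the current packet."""
    return {
        (m, ident)
        for (m, ident) in s["sr"]
        if ident in s["good_ids"]
        and not (s["mode_s"] == "send" and ident == s["last_s"])
    }


def pos_list(s):
    """pos-pairs ordered by when each id was chosen (oldest first)."""
    # Assumption: used_s is a list, so the index of an id encodes the ordering.
    position = {ident: k for k, ident in enumerate(s["used_s"])}
    return sorted(pos_pairs(s), key=lambda packet: position[packet[1]])


def messages(packets):
    """Project a list of (message, id) packets onto its messages."""
    return [m for (m, _ident) in packets]


# Hypothetical state: ids 1..4 were chosen in that order, id 4 is the current one.
state = {
    "mode_s": "send",
    "last_s": 4,
    "used_s": [1, 2, 3, 4],
    "good_ids": {2, 3, 4},
    "sr": {("a", 1), ("b", 2), ("c", 3), ("d", 4)},
}
print(messages(pos_list(state)))  # ['b', 'c']
```

Packet ("a", 1) is excluded because its id is no longer good, and ("d", 4) is excluded because it is the current packet while the sender is in mode send.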


When the mode of the sender is either needid or send, the value of current-msg_s is the message
to be sent to the receiver. (This message has already been removed from buf_s.) Now, the destiny
of this message might be unknown if there has been a crash, because then the id that has been
(or is to be) assigned to the message might not be in good-ids or it might be removed from
good-ids before the message is received. The variable current-ok in A_G is precisely what we need
to state this uncertainty. So, the flag (OK or marked) to be associated with the current message
in the refinement mapping below is then derived from current-ok in state s in the following way:


s.current-flag = (if s.current-ok then OK else marked)

We now define the current queue, i.e., the part of the queue at the D level that corresponds to
the current message at the G level, as follows:


s.current-queue = if s.mode_s = needid ∨ (s.mode_s = send ∧ s.last_s ∈ s.good-ids)
                  then ((s.current-msg_s, s.current-flag))
                  else ε


When the mode of the sender is send and last_s ∈ good-ids we denote by current pair the set
containing the pair (current-msg_s, last_s). In all other states this set is empty. Thus


s.current-pair = if s.mode_s = send ∧ s.last_s ∈ s.good-ids
                 then {(s.current-msg_s, s.last_s)}
                 else ∅
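
As a small sketch, current-flag, current-queue, and current-pair can be written as Python functions over a hypothetical dictionary encoding of the state. The key names are our own, and lists and sets stand in for the sequences and sets of the definitions.

```python
def current_flag(s):
    """OK if current-ok holds, marked otherwise."""
    return "OK" if s["current_ok"] else "marked"


def current_queue(s):
    """One-element list holding the current message, or the empty list."""
    sending_good = s["mode_s"] == "send" and s["last_s"] in s["good_ids"]
    if s["mode_s"] == "needid" or sending_good:
        return [(s["current_msg_s"], current_flag(s))]
    return []


def current_pair(s):
    """Singleton set with the current packet, or the empty set."""
    if s["mode_s"] == "send" and s["last_s"] in s["good_ids"]:
        return {(s["current_msg_s"], s["last_s"])}
    return set()


# Hypothetical state in which the sender still needs an identifier.
waiting = {
    "mode_s": "needid",
    "last_s": None,
    "good_ids": {1},
    "current_msg_s": "m",
    "current_ok": True,
}
```

Note that in mode needid the current message already appears in current-queue although no packet exists yet, so current-pair is empty there.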


We define a function R_GD from states(A_G) to states(A_D). This function will in Lemma 8.14 be
proved to be a refinement mapping from A_G to A_D with respect to I_G and I_D. In the definition,
when we write e.g. “buf_s paired with OK”, we mean the element of (Msg × Flag)* obtained from
buf_s by pairing every message with OK.


Definition 8.13 (Refinement Mapping From A_G to A_D)
If s ∈ states(A_G) then define R_GD(s) to be the state u ∈ states(A_D) such that


1. u.rec_s = (s.mode_s = rec)
   u.rec_r = (s.mode_r = rec)
2. u.queue is the concatenation of
   • s.buf_r paired with OK
   • messages(s.pos-list) paired with marked
   • s.current-queue
   • s.buf_s paired with OK
3. u.status =

        (false, OK)              if s.mode_s = rec                                    A
   else (?, OK)                  if s.buf_s ≠ ε                                       B
   else (?, s.current-flag)      if s.mode_s = needid                                 C(i)
        (?, s.current-flag)      if s.mode_s = send ∧ s.last_s ∈ s.good-ids           C(ii)
        (?, OK)                  if s.mode_s = send ∧ s.last_s = s.last_r ∧ s.buf_r ≠ ε   C(iii)
        (true, OK)               if s.mode_s = send ∧ s.last_s = s.last_r ∧ s.buf_r = ε   C(iv)
        (true, marked)           if s.mode_s = send ∧ s.last_s ≠ s.last_r ∧
                                    (s.last_s, true) ∈ s.rs                           C(v)
        (false, OK)              if s.mode_s = send ∧ s.last_s ∉ s.good-ids ∧
                                    s.last_s ≠ s.last_r ∧
                                    (s.last_s, true) ∉ s.rs                           C(vi)
        (s.current-ack_s, OK)    if s.mode_s = idle                                   C(vii)

It is easy to see that the cases in Part 3 of the definition are exhaustive. However, the cases
C(ii)-C(vi) are overlapping in some non-reachable states (where s.last_s ∈ s.good-ids ∧ (s.last_s =
s.last_r ∨ (s.last_s, true) ∈ s.rs), cf. Invariants 8.4 Part 7 and 8.10 Part 2). Since we shall only
be interested in the image of states satisfying the invariants, this is not a problem in practice.
However, to make R_GD a mapping from all states of A_G to states of A_D, we adopt the convention
that in cases C(ii)-C(vi) the first case (from top to bottom) that is satisfied by a given state is
chosen.
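
The case analysis of Part 3, including the first-match convention, can be sketched as a chain of early returns. The sketch below is illustrative only: the dictionary keys are hypothetical, rs is modelled as a set of (id, bool) pairs, the string "?" stands for the undetermined status, and we assume that the sender modes are exactly rec, needid, send, and idle.

```python
def status(s, current_flag):
    """Part 3 of the mapping; the first matching case wins."""
    if s["mode_s"] == "rec":                          # A
        return (False, "OK")
    if s["buf_s"]:                                    # B
        return ("?", "OK")
    if s["mode_s"] == "needid":                       # C(i)
        return ("?", current_flag)
    if s["mode_s"] == "send":
        if s["last_s"] in s["good_ids"]:              # C(ii)
            return ("?", current_flag)
        if s["last_s"] == s["last_r"]:                # C(iii) and C(iv)
            return ("?", "OK") if s["buf_r"] else (True, "OK")
        if (s["last_s"], True) in s["rs"]:            # C(v)
            return (True, "marked")
        return (False, "OK")                          # C(vi)
    assert s["mode_s"] == "idle"                      # C(vii), by assumption
    return (s["current_ack_s"], "OK")


# Hypothetical state satisfying C(iv): message delivered, ack on its way.
delivered = {
    "mode_s": "send", "buf_s": [], "buf_r": [],
    "last_s": 1, "last_r": 1, "good_ids": set(),
    "rs": set(), "current_ack_s": False,
}
```

Because the tests are tried from top to bottom, a non-reachable state that satisfies both C(ii) and C(iii) is mapped according to C(ii), which is the convention adopted above.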


The intuition behind R_GD is as follows: When either the sender or receiver in A_G is in mode
rec this, of course, corresponds to A_D having either rec_s or rec_r set to true, respectively. This
is captured in Part 1.


Part 2 associates flags with the messages between the sender and the receiver. The messages in
buf_s and buf_r all get paired with the flag OK. That is because these messages are “safe” as long
as no new crashes occur. If a crash occurs at, e.g., the sender side, then of course the elements
in buf_s will be deleted, but this corresponds in A_D to marking these elements and dropping
them. So, the flag associated with a message (or the status below) should indicate the situation
for that message (or status) here and now.

The messages in pos-list are all paired with marked. As explained above, when pos-list
was defined, all elements of pos-list are “old” packets that still might be successfully received.
However, elements of pos-list lose this possibility (i.e., are removed from pos-list) if a packet with
higher id is successfully received by the receiver (since otherwise A_G could rearrange messages).
Thus, messages in pos-list might be lost without any crashes occurring. For this reason these
messages are paired with marked in R_GD.
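
The composition of the abstract queue in Part 2 can be illustrated by a short Python sketch. The four components are passed in as already computed values. The function name and the list encoding are assumptions made for the example only.

```python
def abstract_queue(buf_r, pos_list, current_queue, buf_s):
    """Concatenate the four parts of Part 2, head of the queue first."""
    return (
        [(m, "OK") for m in buf_r]                     # received, not yet delivered
        + [(m, "marked") for (m, _ident) in pos_list]  # old packets, may be lost
        + list(current_queue)                          # already carries its flag
        + [(m, "OK") for m in buf_s]                   # not yet sent
    )


queue = abstract_queue(
    buf_r=["r"],
    pos_list=[("p", 3)],
    current_queue=[("c", "marked")],
    buf_s=["s1", "s2"],
)
```

With this encoding the element coming from current-queue sits at index |buf_r| + |pos-list| when indices are counted from 0, which is the position referred to repeatedly in the proof of Lemma 8.14.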

In current-queue the flag is current-flag. If the receiver has not been in rec mode (which
in this situation implies current-ok = true) since the last prepare action, we know that the id
assigned (or to be assigned) to the current message is in good_r (cf. Invariant 8.6 Parts 5 and
6). Unless crashes occur this will be the case until the current message is successfully received.
(Note that the successful receipt of a message from pos-list cannot cause the id of the current
message to be removed from good_r since all messages in pos-list have ids less than this id.) So, in
this situation current-flag = OK. On the other hand, if a crash has occurred the current message
might still be successfully received but it could be lost. In this case current-flag = marked as
required.


Part 3 deals with the status. First, recall that in A_D status records the status of the last message
sent to the system.

Case A deals with the situation where the sender has crashed. In this situation the last 
message sent can only cause a negative acknowledgement to the user. Therefore status = 
(false, OK). 

In Case B, mode_s ≠ rec and buf_s ≠ ε. Thus, the last element sent is, for now, sitting safely
in buf_s. For this reason we have status = (?, OK).

C(i) and C(ii) describe the situation where the last element sent is in current-queue. Here
status = (?, current-flag), where current-flag = marked if there has been a crash so that it is
permitted to “lose” status (i.e., change it to (false, OK)).

In C(iii) the last message sent has been received by the receiver and is sitting safely in buf_r.

In C(iv) this message has been passed on to the user and the receiver is in the process of
sending positive acknowledgements to the sender. This is a sure positive status, thus, status =
(true, OK).

Case C(v) then describes the situation where a positive acknowledgement has been sent by
the receiver, but where the receiver subsequently has crashed. In this situation the positive
acknowledgement might eventually be successfully received by the sender, but, since the sender
keeps on sending its current packet until it receives an acknowledgement, the receiver might issue
negative acknowledgements for the current message and these negative acknowledgements could
pass the positive acknowledgements in rs such that the sender receives a negative acknowledgement
for the current message. The latter situation corresponds in A_D to status being lost. This
explains why status = (true, marked) in case C(v). Note that in the situation just explained,
the current message has been successfully delivered to the user, but a subsequent crash could
cause status to be lost anyway (recall that this is allowed by the specification).

Case C(vi) actually describes two situations: (a) the id assigned to the current message is 
such that the current message can never be successfully received by the receiver. Thus, the 
receiver can only issue negative acknowledgements for this message. The other situation is: (b) 
the current message has been successfully received, but the receiver crashed before successfully 
placing a positive acknowledgement on the channel rs. Again, only negative acknowledgements 
can be received by the sender. This explains status = (false, OK). 

Finally, case C(vii) reflects the acknowledgement received by the sender for the (last) current 
message. 


After having used our knowledge and intuition about A_G and A_D to define R_GD, we still need
to verify that R_GD is in fact a refinement mapping from A_G to A_D (with respect to I_G and I_D).
The following lemma states that this is the case.


Lemma 8.14
A_G ≤_R A_D via R_GD.


Proof 


We prove that R_GD is a refinement mapping from A_G to A_D with respect to I_G and I_D. We check
the two conditions (which we call base case and inductive case, respectively) of Definition 5.2.


Base Case
It is easy to see that for any start state s of A_G, R_GD(s) is a start state of A_D.

Inductive Case
Assume (s, a, s') ∈ steps(A_G) such that s and s' satisfy I_G and R_GD(s) satisfies I_D (Invariant 7.1).
Below we consider cases based on a (and sometimes subcases of each case) and for each (sub)case
we define a finite execution fragment α of A_D of the form (R_GD(s), a', u'', a'', u''', ..., R_GD(s'))
with trace(α) = trace(a). For brevity we let u denote R_GD(s) and u' denote R_GD(s').


Unless otherwise stated we let Part 1-3 refer to the three parts of Definition 8.13.

a = send_msg(m)
We consider cases based on s.mode_s.


1. s.mode_s ≠ rec
Then, it is easy to see that (u, send_msg(m), u') is a step of A_D and thus a finite execution
fragment with the right trace.

2. s.mode_s = rec
Then s' = s, so also u' = u.
We show that (u, send_msg(m), u'', mark(I), u''', drop(I), u'), where u'', u''', and I are
defined below, is a finite execution fragment of A_D by showing that (u, send_msg(m), u''),
(u'', mark(I), u'''), and (u''', drop(I), u') are steps of A_D. Clearly the execution fragment
has the right trace.


Define u''.rec_s = u.rec_s
       u''.rec_r = u.rec_r
       u''.queue = u.queue ^ (m, OK)
       u''.status = (?, OK)
Then obviously (u, send_msg(m), u'') ∈ steps(A_D).
Define u'''.rec_s = u.rec_s (= true)
       u'''.rec_r = u.rec_r
       u'''.queue = u.queue ^ (m, marked)
       u'''.status = u''.status
Thus the only difference between u'' and u''' is that the element at the end of queue is
marked in u'''. Define I = {maxidx(u''.queue)}. Then, since u''.rec_s = true, obviously
(u'', mark(I), u''') ∈ steps(A_D).
Finally, we have to show that (u''', drop(I), u') ∈ steps(A_D). First note that drop is enabled
in u''' since I contains the index of the last element of u'''.queue and this element is marked
by explicit construction. It now suffices that the four state variables of A_D are handled
correctly.
rec_s and rec_r:
We have (by construction and the fact that u' = u) u'''.rec_s = u'.rec_s and u'''.rec_r =
u'.rec_r as required by the definition of drop(I) in A_D.
queue:
We have (again by construction and the fact that u' = u) u'''.queue = u'.queue ^
(m, marked). Thus, since drop(I) requires the last element of queue to be deleted, queue
is handled correctly.
status:
Since the element at the end of queue is deleted, the definition of drop(I) requires that
u'.status = (false, OK), but this is the case since u.status = (false, OK) (from the definition
of R_GD) and u' = u.
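
The fragment used in this subcase can be replayed on a toy encoding of A_D. The sketch below is not the definition of A_D: the dictionary state, the helper functions mark, delete, and maxidx over 0-indexed Python lists, and the simplified effects of each step are assumptions chosen to mirror the construction of u'' and u''' above.

```python
def mark(queue, indices):
    """Set the flag of every element whose index is in indices to marked."""
    return [(m, "marked") if k in indices else (m, f)
            for k, (m, f) in enumerate(queue)]


def delete(queue, indices):
    """Remove every element whose index is in indices."""
    return [e for k, e in enumerate(queue) if k not in indices]


def maxidx(queue):
    """Index of the last element (indices are assumed to start at 0)."""
    return len(queue) - 1


def send_while_crashed(u, m):
    """States u, u'', u''', u' of the fragment send_msg(m), mark(I), drop(I)."""
    assert u["rec_s"], "the construction is only used while the sender is crashed"
    u2 = dict(u, queue=u["queue"] + [(m, "OK")], status=("?", "OK"))
    index_set = {maxidx(u2["queue"])}
    u3 = dict(u2, queue=mark(u2["queue"], index_set))
    # drop(I) is only enabled if all selected elements are marked.
    assert all(u3["queue"][k][1] == "marked" for k in index_set)
    u_prime = dict(u3, queue=delete(u3["queue"], index_set), status=(False, "OK"))
    return [u, u2, u3, u_prime]


# Hypothetical abstract state with a crashed sender.
u = {"rec_s": True, "rec_r": False, "queue": [("a", "OK")], "status": (False, "OK")}
states = send_while_crashed(u, "m")
```

The last state of the fragment equals the first one, which matches the observation u' = u: a message submitted to a crashed sender is appended, marked, and dropped again.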


a = receive_msg(m)


We show that (u, receive_msg(m), u') ∈ steps(A_D). The step clearly has the right trace.


From the precondition of the receive_msg(m) steps in A_G we have that s.mode_r = rcvd,
s.buf_r ≠ ε, and head(s.buf_r) = m. The definition of R_GD then implies that u.queue ≠ ε
and head(u.queue) = (m, OK). Thus, from the definition of the receive_msg(m) steps in A_D we
see that receive_msg(m) is enabled in u. It now suffices to show that the four state variables of
A_D are handled correctly.


rec_s, rec_r, and queue:
It is easy to see that u'.rec_s = u.rec_s, u'.rec_r = u.rec_r, and u'.queue = tail(u.queue), as
required by the definition of receive_msg(m) in A_D.
status:
We consider cases based on which condition (A, B, C(i)-C(vii)) s satisfies in Part 3.
Suppose s satisfies the condition in case A, C(v), C(vi), or C(vii). Then s' satisfies the same
condition, so u.status = u'.status. Since in all cases u.status.stat ≠ ?, leaving status unchanged
is permitted by the definition of receive_msg(m) in A_D.

Suppose s satisfies the condition in case B, C(i), or C(ii). Then s' satisfies the same condition,
so u.status = u'.status. In all three cases it is easy to see that u'.queue ≠ ε so it is allowed by
the definition of receive_msg(m) in A_D to leave status unchanged.

Suppose s satisfies the condition in case C(iii). If s'.buf_r ≠ ε then s' also satisfies this condition
but in this case u'.queue ≠ ε so it is permitted by the definition of receive_msg(m) in D to leave
status unchanged. So, assume s'.buf_r = ε. Then s' satisfies the condition in case C(iv). Thus,
u.status = (?, OK) and u'.status = (true, OK). Also, s'.buf_s = ε and Invariant 8.8 Part 2 implies
that both s'.pos-list = ε and s'.current-queue = ε. Then, since s'.buf_r = ε, u'.queue = ε. Thus,
changing status from (?, OK) to (true, OK) is as required by receive_msg(m) in A_D.

Finally, the precondition of receive_msg(m) in A_G implies that s cannot satisfy the condition
in case C(iv).


a = ack(b)


We show that (u, ack(b), u') ∈ steps(A_D). The step clearly has the right trace.
By definition of ack(b) in A_G we have s' = s so also u' = u.


From the precondition of ack(b) in A_G we have s.mode_s = idle, s.buf_s = ε, and s.current-ack_s =
b. Then u.status = (s.current-ack_s, OK) = (b, OK) (by case C(vii) of Part 3). Thus, ack(b) is
enabled in u.


Since the flag of u.status is OK, it is now easily seen that (u, ack(b), u') is a step of D.


a = crash_s


We show that (u, crash_s, u'', mark(I), u''', drop(I'), u'), where u'', u''', I, and I' are defined below,
is a finite execution fragment of A_D by showing that (u, crash_s, u''), (u'', mark(I), u'''), and
(u''', drop(I'), u') are steps of A_D. Clearly the execution fragment has the right trace.


Define u''.rec_s = true
       u''.rec_r = u.rec_r
       u''.queue = u.queue
       u''.status = u.status


Then clearly (u, crash_s, u'') ∈ steps(A_D).
First let i_cq = |s.buf_r| + |s.pos-list|. Then, define

u'''.rec_s = u''.rec_s
u'''.rec_r = u''.rec_r
(u'''.queue, I, I') =
    (u''.queue, ∅, ∅)      if s.mode_s ∈ {idle, rec} ∨
                              (s.mode_s = send ∧ s.last_s ∉ s.good-ids)
    (q, {i_cq}, ∅)         if s.mode_s = send ∧ s.last_s ∈ s.good-ids ∧
                              (s.current-msg_s, s.last_s) ∈ s.sr,
                              where q = mark(u''.queue, {i_cq})
    (q, {i_cq}, {i_cq})    otherwise,
                              where q = mark(u''.queue, {i_cq})
u'''.status = (u''.status.stat, marked)


Since u''.rec_s = true, clearly mark(I) is enabled in u''. To prove that (u'', mark(I), u''') ∈
steps(A_D) it now suffices to show that all four state variables of A_D are handled correctly.


rec_s and rec_r:
Leaving rec_s and rec_r unchanged is as required by the definition of mark(I) in A_D.
queue:
By explicit construction of u'''.queue and I, it is easy to see that queue is handled correctly.
status:
Marking status is allowed by the definition of mark(I) in A_D.
Thus, (u'', mark(I), u''') ∈ steps(A_D).
Finally, we must show that (u''', drop(I'), u') ∈ steps(A_D). Clearly drop(I') is enabled in u''', so
it suffices to show that the four state variables of A_D are handled correctly.


rec_s and rec_r:
We have u'.rec_s = true = u'''.rec_s and u'.rec_r = u.rec_r = u'''.rec_r. Leaving rec_s and rec_r
unchanged is as required by the definition of drop(I') in A_D.
status:
We have u'.status = (false, OK) since s'.mode_s = rec, and this is allowed by the definition of
drop(I') in A_D.

queue:
First, assume s.mode_s ∈ {idle, rec} or s.mode_s = send ∧ s.last_s ∉ s.good-ids. Then it is
easy to see that u'.queue = u'''.queue = u.queue. Leaving queue unchanged is as required by
the definition of drop(I') in A_D since in this case I' = ∅.
Next, assume (s.mode_s = send ∧ s.last_s ∈ s.good-ids ∧ (s.current-msg_s, s.last_s) ∉ s.sr) or
s.mode_s = needid. Then we have s.current-queue = ((s.current-msg_s, s.current-flag)) and
s'.current-queue = ε. But the other three parts (buf_s, buf_r, and pos-list) that make up the
abstraction of a queue in A_D are unchanged. (Note, in the definition of u'''.queue in this case,
that the element in u.queue that corresponds to s.current-queue has index i_cq.) Then, it is
easy to see that u'.queue = delete(u'''.queue, {i_cq}). Thus, by explicit construction of I' and
the definition of drop(I') it is seen that queue is handled as required.
Finally, assume (s.mode_s = send ∧ s.last_s ∈ s.good-ids ∧ (s.current-msg_s, s.last_s) ∈ s.sr).
Again, we have s.current-queue = ((s.current-msg_s, s.current-flag)) and s'.current-queue = ε.
But in this case we have s'.pos-pairs = s.pos-pairs ∪ {(s.current-msg_s, s.last_s)}. Then Invariant
8.8 Part 1 implies that s'.pos-list = s.pos-list ^ (s.current-msg_s, s.last_s). We now have
that the only difference between u'.queue and u.queue is that one of the elements (the one
corresponding to (s.current-msg_s, s.last_s)) in u'.queue is marked (which it might not be in
u.queue). But this gives us u'.queue = u'''.queue, and since I' = ∅ in this case, it is seen that
queue is handled as required by the definition of drop(I') in A_D.

Thus, (u''', drop(I'), u') ∈ steps(A_D) as required.
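
The three-way choice of the index sets I and I' in the crash_s case can be summarised by a small Python function. This is a sketch under assumptions: the state is a hypothetical dictionary, pos_list is passed in precomputed, sr is a set of (message, id) pairs, and indices into the abstract queue start at 0.

```python
def crash_s_index_sets(s):
    """Return (I, I') for the fragment crash_s, mark(I), drop(I')."""
    i_cq = len(s["buf_r"]) + len(s["pos_list"])  # index of the current element
    mode = s["mode_s"]
    last_good = s["last_s"] in s["good_ids"]
    if mode in ("idle", "rec") or (mode == "send" and not last_good):
        # No current element in the abstract queue: nothing to mark or drop.
        return set(), set()
    if mode == "send" and last_good and (s["current_msg_s"], s["last_s"]) in s["sr"]:
        # The current packet survives in sr and moves to pos-list: mark only.
        return {i_cq}, set()
    # Otherwise the current message is lost: mark it and then drop it.
    return {i_cq}, {i_cq}


# Hypothetical state: one message in buf_r, one old packet, current packet in sr.
in_flight = {
    "mode_s": "send", "last_s": 7, "good_ids": {7},
    "buf_r": ["x"], "pos_list": [("y", 5)],
    "current_msg_s": "m", "sr": {("m", 7)},
}
```

The middle case is the interesting one: the current message stays in the abstract queue, but from then on it is accounted for by pos-list rather than by current-queue.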


a = crash_r


We show that (u, crash_r, u'', mark(I), u'), where u'' and I are defined below, is a finite execution
fragment of A_D by showing that (u, crash_r, u'') and (u'', mark(I), u') are steps of A_D. Clearly
the execution fragment has the right trace.


Define u''.rec_r = true
       u''.rec_s = u.rec_s
       u''.queue = u.queue
       u''.status = u.status
Clearly (u, crash_r, u'') ∈ steps(A_D).
Define

I = {|s.buf_r| + |s.pos-list|}   if s.mode_s = needid ∨ (s.mode_s = send ∧ s.last_s ∈ s.good-ids)
    ∅                            otherwise


We now show that (u'', mark(I), u') ∈ steps(A_D). First note that since u''.rec_r = true, the
definitions of I and R_GD imply that mark(I) is enabled in u''. It thus suffices to show that the
four state variables of A_D are handled correctly.


rec_s and rec_r:
We have u'.rec_r = true = u''.rec_r and u'.rec_s = u.rec_s = u''.rec_s. Leaving rec_s and rec_r
unchanged is as required by the definition of mark(I) in A_D.

queue and status:
First assume s.mode_s = needid or s.mode_s = send ∧ s.last_s ∈ s.good-ids. In this case the
only difference in states s and s' of the four components that make up the abstraction of a
queue in Part 2 is that the element in current-queue is marked in s' whereas it might be OK
in s. So, the only difference between u''.queue (= u.queue) and u'.queue is that the element
with index |s.buf_r| + |s.pos-list| has changed its flag to marked, but by definition of I in this
case, this is as required by the definition of mark(I) in A_D. For status, if s.buf_s ≠ ε then
u.status = u'.status = (?, OK) by Part 3B. But leaving status unchanged is allowed by the
definition of mark(I) in A_D. If s.buf_s = ε then s satisfies either Part 3C(i) or 3C(ii) and s'
satisfies the same part. In this case status might change its flag from OK to marked but again
this is allowed by the definition of mark(I) in A_D.
Finally, in all other cases u.queue = u'.queue and u.status = u'.status so mark(I) should be a
no-op, but again this is allowed by the definition of mark(I) in A_D since in this case I = ∅.


a = recover_s


We show that (u, mark(I), u'', drop(I), u''', recover_s, u'), where u'', u''', and I are defined below,
is a finite execution fragment of A_D by showing that (u, mark(I), u''), (u'', drop(I), u'''), and
(u''', recover_s, u') are steps of D. Clearly the execution fragment has the right trace.


Define I = {i | maxidx(u.queue) − (|s.buf_s| − 1) ≤ i ≤ maxidx(u.queue)}.


Thus, I contains the indices of the last |s.buf_s| elements in u.queue.

Define u''.rec_s = u.rec_s
       u''.rec_r = u.rec_r
       u''.queue = mark(u.queue, I)
       u''.status = u.status


Since s.mode_s = rec we have u.rec_s = true so the definition of I implies that mark(I) is enabled
in u. Then it is easy to see that (u, mark(I), u'') ∈ steps(A_D).


Define u'''.rec_s = u''.rec_s
       u'''.rec_r = u''.rec_r
       u'''.queue = delete(u''.queue, I)
       u'''.status = (false, OK)


The definitions of I and u''.queue imply that drop(I) is enabled in u''. Now, to show that
(u'', drop(I), u''') ∈ steps(A_D), it suffices to show that the four state variables of A_D are handled
correctly.


rec_s and rec_r:
Leaving rec_s and rec_r unchanged is as required by the definition of drop(I) in A_D.
queue:
By explicit construction of u'''.queue, clearly queue is handled correctly.
status:
Since drop(I) is always allowed to change status to (false, OK), status is handled correctly.
Thus, (u'', drop(I), u''') ∈ steps(A_D).


Finally, we prove that (u''', recover_s, u') ∈ steps(A_D). Since u'''.rec_s = u''.rec_s = u.rec_s = true,
we have that recover_s is enabled in u'''. We show that the four state variables of A_D are handled
correctly.


rec_s and rec_r:
Leaving rec_r unchanged and changing rec_s from true to false is as required by the definition
of recover_s in A_D.

queue:
Note that s.current-queue = s'.current-queue = ε, s.pos-list = s'.pos-list, and s.buf_r =
s'.buf_r. So, since buf_s is emptied in the recover_s step of A_G, the only difference between
u.queue and u'.queue is that the last |s.buf_s| elements of u.queue are missing in u'.queue.
Thus, u'.queue = u'''.queue as required by the definition of recover_s in A_D.

status:
Since s'.mode_s = idle, s'.buf_s = ε, and s'.current-ack_s = false, we have u'.status = (false, OK)
by Part 3C(vii). Thus, u'.status = u'''.status as required by the definition of recover_s in A_D.

Thus, (u''', recover_s, u') ∈ steps(A_D).
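
The index set I used in the recover_s case, and its effect under mark and delete, can be checked on a small example. The sketch assumes 0-indexed Python lists for queues and uses helper functions of our own. The parameter k plays the role of |s.buf_s|.

```python
def mark(queue, indices):
    """Set the flag of every element whose index is in indices to marked."""
    return [(m, "marked") if i in indices else (m, f)
            for i, (m, f) in enumerate(queue)]


def delete(queue, indices):
    """Remove every element whose index is in indices."""
    return [e for i, e in enumerate(queue) if i not in indices]


def last_indices(queue, k):
    """I = {i | maxidx - (k - 1) <= i <= maxidx}, the indices of the last k elements."""
    maxidx = len(queue) - 1
    return {i for i in range(len(queue)) if maxidx - (k - 1) <= i <= maxidx}


# Hypothetical abstract queue whose last two elements stem from buf_s.
u_queue = [("a", "OK"), ("b", "marked"), ("c", "OK"), ("d", "OK")]
index_set = last_indices(u_queue, 2)
u2_queue = mark(u_queue, index_set)     # queue of u''
u3_queue = delete(u2_queue, index_set)  # queue of u'''
```

For k = 0 the set is empty, so both mark(I) and drop(I) leave the queue unchanged, which covers a recovery with an empty buf_s.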


a = recover_r


We show that (u, mark(I), u'', drop(I), u''', recover_r, u'), where u'', u''', and I are defined below,
is a finite execution fragment of A_D by showing that (u, mark(I), u''), (u'', drop(I), u'''), and
(u''', recover_r, u') are steps of A_D. Clearly the execution fragment has the right trace.


First, define u''.rec_s = u.rec_s
              u''.rec_r = u.rec_r
              u'''.rec_s = u''.rec_s
              u'''.rec_r = u''.rec_r

Below we define I so that it contains indices of u.queue and indices of marked elements in
u''.queue. Then, since s.mode_r = rec we have u.rec_r = true, so mark(I) is enabled in u, drop(I)
is enabled in u'', and finally recover_r is enabled in u''' since we also have u'''.rec_r = true.


We now show that the four state variables in A_D are handled correctly by all steps in the
execution fragment.


rec_s and rec_r:
As in the case a = recover_s above it is easy to see that rec_s and rec_r are handled correctly.


queue:
Note that s'.good-ids ⊆ s.good-ids since issued_r might be extended in the recover_r step of
A_G. This leads to the observations that (a) either s'.current-queue = s.current-queue or
s'.current-queue = ε, and (b) s'.pos-pairs ⊆ s.pos-pairs so that s'.pos-list can be obtained
from s.pos-list by deleting some elements. Also we have s.buf_s = s'.buf_s and s'.buf_r = ε.
Thus, u'.queue can be obtained from u.queue by deleting some elements. By letting I be the
indices of these elements, the elements are marked in the mark(I) step and then deleted in
the drop(I) step. Thus, queue is handled correctly.

status:
We consider cases based on which condition in Part 3 is satisfied by s.
Suppose s satisfies condition A. Then so does s' so we have u.status = u'.status = (false, OK)
which is allowed by the execution fragment of A_D.
If s satisfies condition B, then so does s' so we have u.status = u'.status = (?, OK). This is
allowed by the execution fragment of A_D provided that the element at the end of u.queue was
not deleted in the drop(I) step but this is the case (that it was not deleted) since s.buf_s =
s'.buf_s ≠ ε.
Also, if s satisfies C(i) then so does s' (with s.current-flag = s'.current-flag), and this is
allowed since s.buf_s = s'.buf_s = ε and s.current-queue = s'.current-queue ≠ ε so the last
element of u.queue was not deleted in the drop(I) step.
If s satisfies C(ii) then s.last_s = s'.last_s ∉ ids(s.rs) = ids(s'.rs) (by Invariant 8.10 Part 2)
and s.last_s ≠ nil (by Invariant 8.1 Part 2). Now, if s'.last_s ∈ s'.good-ids then s' satisfies
C(ii) so s.current-queue = s'.current-queue ≠ ε. As for case C(i) we see that this is allowed.
If s'.last_s ∉ s'.good-ids then, since s'.last_r = nil ≠ s'.last_s, s' satisfies condition C(vi), so
u'.status = (false, OK) which is allowed by the execution fragment.
Now, suppose s satisfies C(iii). Then Invariant 8.4 Part 7 implies s.last_s ∉ s.good-ids which
again implies s'.last_s ∉ s'.good-ids since s'.good-ids ⊆ s.good-ids. Invariant 8.9 Part 3
implies (s.last_s, true) ∉ s.rs, i.e., (s'.last_s, true) ∉ s'.rs. Thus, s' satisfies condition C(vi), so
u'.status = (false, OK) which is allowed by the execution fragment of A_D.
If s satisfies C(iv) we consider two subcases. If (s.last_s, true) ∉ s.rs the case is similar to case
C(iii) above. So assume (s.last_s, true) ∈ s.rs. Then s' satisfies C(v) so u.status = (true, OK)
and u'.status = (true, marked). This marking of status is allowed by mark(I) in A_D. The
total change of status is allowed if the element at the end of u''.queue is not deleted in the
drop(I) step. Invariant 8.8 Part 2 implies that s.current-queue = s.pos-list = ε so u.queue = ε,
thus there is no last element to be deleted. That suffices.
If s satisfies C(v), then so does s' (Invariant 8.1 Part 2 implies s'.last_s ≠ nil = s'.last_r). Thus,
u.status = u'.status = (true, marked). This is allowed since u.queue = ε (so the last element of
the queue cannot be deleted in the drop(I) step). To see why u.queue = ε, we have from C(v)
that s.buf_s = ε and Invariants 8.8 Part 3 and 8.9 Part 3 imply s.current-queue = s.pos-list =
s.buf_r = ε. That suffices.
If s satisfies condition C(vi) then so does s' (arguments as above). Thus, u.status = u'.status =
(false, OK) which is allowed by the execution fragment.

Finally, if s satisfies condition C(vii), then so does s'. We then have u.status = u'.status =
(s.current-ack_s, OK). This is easily seen to be allowed if s.current-ack_s = false. So, assume
s.current-ack_s = true. Then having u.status = u'.status = (true, OK) is allowed provided the
element at the end of u.queue is not deleted in the drop(I) step. A sufficient condition is
to show u.queue = ε. From C(vii) we have s.buf_s = s.current-queue = ε and Invariants 8.8
Part 4 and 8.9 Part 4 imply s.pos-list = s.buf_r = ε. Thus, u.queue = ε.


a = prepare
We consider two cases


• s.mode_r = rec
We show that (u, mark(I), u') ∈ steps(A_D), where I = {|s.buf_r| + |s.pos-list|}. This step (and
execution fragment) clearly has the right trace (the empty trace).
Since s.mode_r = rec, we have u.rec_r = true, so clearly mark(I) is enabled in u.
We show that the four state variables of A_D are handled correctly.


rec_s and rec_r:
We have s.mode_s = idle and s'.mode_s = needid, so u.rec_s = u'.rec_s = false which is as
required by the definition of mark(I) in A_D. From the case hypothesis and the definition
of prepare in A_G, we have s.mode_r = s'.mode_r = rec, so u.rec_r = u'.rec_r = true which is
also as required by the definition of mark(I).

queue:
Note that the element at the head of buf_s is moved to current-msg_s in the prepare step of
A_G. From the definition of R_GD we have that this element goes from being OK when it was
in buf_s to being marked (s.mode_r = rec implies, by Invariant 8.6 Part 2, s'.current-ok =
false which in turn implies s'.current-flag = marked) when it is in current-queue. Neither
buf_r nor pos-list is changed in the prepare step. Thus, u'.queue is the same as u.queue
except that the message at position |s.buf_r| + |s.pos-list| is marked in u'.queue and OK in
u.queue. This is as required by the definition of mark(I) in A_D.

status:
We have u.status = (?, OK) since s.buf_s ≠ ε (from the precondition of the prepare step).
Either state s' satisfies condition 3B in which case u'.status = (?, OK) or s' satisfies
condition C(i) in which case u'.status = (?, marked). Both of these situations are allowed
by the definition of mark(I) in A_D.

Thus, (u, mark(I), u') ∈ steps(A_D).


• s.mode_r ≠ rec


Here we have s'.current-flag = OK from the effect of the prepare step, so with arguments
similar to those used in the previous case it is easy to show that u' = u. Thus, the
execution fragment consisting of only the state u has the right trace. That suffices.


a = choose_id(id) 


We consider two cases 
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e s'.last, € s'.good-ids 


We show that (uw, drop(1),u’) € steps(Ap), where I = {|s.buf,,| + |s.pos-list|}. This step 

(and finite execution fragment) clearly has the right trace (the empty trace). 

We show that the four state variables of Ap are handled correctly. 

rec, and rec: 
We have s.mode, = needid, s’.mode, = send, and s.mode, = s’.mode, which implies 
u.rec, = u’.rec, and u.rec, = w'.rec, as required by the definition of drop(I) in Ap. 

queue: 
We note that s'.buf_s = s.buf_s, s'.pos-list = s.pos-list, and s'.buf_r = s.buf_r. However,
s'.current-queue = ε but s.current-queue ≠ ε. Thus, u'.queue can be obtained from
u.queue by deleting the element that corresponds to s.current-queue. From the case
hypothesis and the definition of choose_id(id) in A_G we have s.good_s ⊈ s.good-ids (note,
s'.good-ids = s.good-ids). Now, since s.mode_s = needid, Invariant 8.6 Part 5 implies
s.current-ok = false, which again implies s.current-flag = marked. Thus, the flag of
the element s.current-queue is marked. Now, s.current-queue corresponds to position
|s.buf_r| + |s.pos-list| in u.queue. Since this element is marked, drop(I) is enabled in u.
Furthermore, it is easy to see that queue is handled correctly.

status: 
If s.buf_s ≠ ε then also s'.buf_s ≠ ε, so both s and s' satisfy condition 3B. Thus, u.status =
u'.status = (?, OK). This is allowed by drop(I) since the element at the end of queue is not
deleted because s.buf_s = s'.buf_s ≠ ε. Now, if s.buf_s = ε, s satisfies condition 3C(i), i.e.,
u.status = (?, false) since s'.current-flag = marked (see the discussion for queue above).
We show that s' satisfies 3C(vi) such that u'.status = (false, OK), which is allowed by
drop(I). This amounts to showing s'.last_r ≠ s'.last_s and (s'.last_s, true) ∉ s'.rs since the
case hypothesis and the definition of choose_id(id) give us the rest:
From the definition of choose_id(id) we get id = s'.last_s ∈ s.good_s. Invariant 8.2 Part 1
then implies s'.last_s ∉ s.used_s. Also, s'.last_s ≠ nil by Invariant 8.1 Part 2. Invariant 8.4
Part 8 implies (since s.last_r = s'.last_r) that s'.last_r = nil or s'.last_r ∈ s.used_s. Thus,
we get s'.last_r ≠ s'.last_s as required. Also, since s'.last_s ∉ s.used_s, Invariant 8.4 Part 6
implies (s'.last_s, true) ∉ s.rs = s'.rs as required.

Thus, (u, drop(I), u') ∈ steps(A_D).

• s'.last_s ∈ s'.good-ids

We show u' = u by comparing the four state variables of A_D in u and u'. The execution
fragment u then has the right properties.

rec_s and rec_r:
We have s.mode_s = needid, s'.mode_s = send, and s.mode_r = s'.mode_r, which implies
u.rec_s = u'.rec_s and u.rec_r = u'.rec_r as required.

queue: 
Here we have s'.current-queue = s.current-queue. Then it is easy to see that u'.queue =
u.queue.

status: 
We have that either both s and s' satisfy condition 3B, or s satisfies 3C(i) and s' satisfies
3C(ii). In both cases u'.status = u.status as required.


a = send_pkt_sr(m, id)


We show u = u' by comparing the four state variables of A_D in u and u'. The execution fragment
u then has the right properties.


rec_s and rec_r:
We have s.mode_s = s'.mode_s and s.mode_r = s'.mode_r, which implies u.rec_s = u'.rec_s and
u.rec_r = u'.rec_r as required.

queue:
We have s'.buf_s = s.buf_s, s'.current-queue = s.current-queue and s'.buf_r = s.buf_r. The
send_pkt_sr(m, id) step might add some copies of (m, last_s) to the channel sr. However, since
mode_s = send, this does not change the value of pos-pairs, so s'.pos-list = s.pos-list. Thus,
u'.queue = u.queue.

status: 
Whatever condition in Part 3 of Definition 8.13 s satisfies, s' satisfies the same. This implies
u'.status = u.status.


a = receive_pkt_sr(m, id)


Since this step may remove the last copy of (m, id) from the channel sr (a multiset), we generally
have s'.pos-pairs ⊆ s.pos-pairs. (Note that the ordering of pairs is unchanged since used_s is
unchanged.) Also, we have s'.buf_s = s.buf_s.


We consider cases.


• s.mode_r = rec
In this case the only change in the step of A_G is the above-mentioned change of the channel
sr. We show (u, drop(I), u') ∈ steps(A_D), where I is defined below. This step (and finite
execution fragment) clearly has the right trace (the empty trace).
Define I = ∅                  if (m, id) ∉ s.pos-list ∨ (m, id) ∈ s'.pos-list
       I = {|s.buf_r| + i}    otherwise, where s.pos-list[i] = (m, id)

Clearly drop(I) is enabled in u (elements in pos-list correspond to marked elements in
u.queue). We show that all four state variables of A_D are handled correctly.

rec, and rec,: 
It is easy to see that we have u'.rec, = u.rec, and u'.rec, = u.rec,(= false) as required
by the definition of drop(I) in Ap.

queue: 
We have s'.current-queue = s.current-queue, s'.buf_s = s.buf_s, and s'.buf_r = s.buf_r.
Then the definition of I implies that queue is handled as required by the definition of
drop(I) in A_D.

status: 
We have from Part 3 that u'.status = u.status since none of the variables occurring in
Part 3 are changed in the step of A_G. This is allowed by drop(I) provided either the value
of status is (false, OK) or the element at the end of queue was not deleted. For conditions
A, B, C(i), C(ii), and C(vi) this is obvious. For C(ii) and C(iii) we get from Invariant 8.8
Part 2 that pos-list = ε, so u'.queue = u.queue, which suffices. For C(iv) Invariant 8.8
Part 3 implies in the same way that u'.queue = u.queue. Finally, for C(vii) only the case
where current-ack_s = true is of interest. But again we get u'.queue = u.queue, this time
because of Invariant 8.8 Part 4.


• s.mode_r ≠ rec
We consider cases based on the if-statement in the definition of receive_pkt_sr(m, id) in
A_{G,s/r}.

- id ∈ s.good_r

This implies id ∈ s.good-ids.

We show that (u, drop(I), u'', unmark(I'), u'), where u'', I, and I' are defined below, is
a finite execution fragment of A_D. The execution fragment clearly has the right trace
(the empty trace).

rec_s and rec_r:
It is easy to see that we have u'.rec_s = u.rec_s and u'.rec_r = u.rec_r (= false). Define
u''.rec_s = u.rec_s and u''.rec_r = u.rec_r. Leaving rec_s and rec_r unchanged is as required
by the definitions of drop(I) and unmark(I') in A_D.

queue: 
Since id ∈ s.good-ids we have that (m, id) ∈ s.pos-pairs ∪ s.current-pair where, by
definition, s.pos-pairs and s.current-pair are disjoint (all ids are different).
First, assume (m, id) ∈ s.pos-pairs. The effect of receiving this pair is to remove
from good_r (and thus from good-ids) all ids less than or equal to id. This corresponds
to removing an initial prefix of s.pos-list up to and including (m, id). At the same
time m is moved to the end of buf_r. Invariant 8.8 Part 1 and the fact that s.pos-pairs
and s.current-pair are disjoint give us s.current-queue = s'.current-queue. Thus,
u'.queue can be obtained from u.queue by deleting some elements corresponding to
the initial prefix of s.pos-list and changing the flag of the element corresponding to
(m, id) to OK (since now this element is in buf_r). Then clearly I and I' can be defined
so that the change in queue is as required by the definitions of drop(I) and unmark(I')
in A_D.
If (m, id) ∈ s.current-pair a similar argument gives us that u'.queue can be obtained
from u.queue by deleting all elements corresponding to elements in s.pos-list and
setting the flag of the element corresponding to s.current-queue to OK. In this case
s'.current-queue = ε. Again, I and I' can be defined.

status: 
If s satisfies condition A, B, or C(i) of Part 3 then so does s'. This is allowed by
drop(I) and unmark(I') since either u'.status = (false, OK) or the element at the end
of u.queue was not deleted.
If s satisfies C(ii) then s' satisfies either C(ii) or C(iii). In both cases the element
at the end of u.queue was not deleted (as required) and the possible flag change of status
to OK is allowed by unmark(I').
s cannot satisfy C(iii), C(iv), or C(v) since then Invariant 8.8 Parts 2 and 3 would
imply that no packets in s.sr could be received successfully, which contradicts the
assumption that id ∈ s.good_r.
If s satisfies C(vi) then so does s'. This is allowed by drop(I) and unmark(I') in A_D.
Finally, assume s satisfies C(vii). Then s.current-ack_s = false since we otherwise
would have a contradiction with Invariant 8.8 Part 4. Thus, u'.status = u.status =
(false, OK), which is allowed by drop(I) and unmark(I') in A_D.

- id ∉ s.good_r
Then (u, drop(I), u') ∈ steps(A_D).
The proof is similar to the proof in case s.mode_r = rec above.


a = send_pkt_rs(id, b)


Here it is easy to see that u = u'. That suffices since then the execution fragment u of A_D
has the right properties.


a = receive_pkt_rs(id, b)


We consider cases.


• s.mode_s = send ∧ s.last_s = id


We show that (u, drop(∅), u'', unmark(∅), u'), where u'' is defined below, is a finite execution
fragment of A_D. The execution fragment clearly has the right trace (the empty trace).


Define u''.rec_s = u.rec_s,
       u''.rec_r = u.rec_r,
       u''.queue = u.queue.


We will define u''.status below when we consider cases.


First note that drop(∅) and unmark(∅) are enabled in u and u'', respectively, since these
actions have no precondition. We show that all four state variables of A_D are handled
correctly by the two steps in the execution fragment.


rec_s and rec_r:
We obviously have u'.rec_s = u.rec_s = u''.rec_s and u'.rec_r = u.rec_r = u''.rec_r. Leaving
rec_s and rec_r unchanged is as required by the definitions of drop(∅) and unmark(∅) in
A_D.

queue: 
First observe that s'.buf_s = s.buf_s and s'.buf_r = s.buf_r. Since (s.last_s, b) ∈ s.rs,
Invariant 8.10 Part 2 implies that s.last_s ∉ s.good-ids, thus s.current-queue = ε. Also
s'.current-queue = ε since s'.mode_s = idle. The receive_pkt_rs(id, b) step in A_G might
cause (s.current-msg_s, s.last_s) to be added to pos-pairs (the pair might have been put onto
sr but did not figure in s.pos-pairs because s.mode_s = send). (s.current-msg_s, s.last_s)
is, however, not added to pos-pairs since s.last_s ∉ s.good-ids as explained above. Thus,
we have s'.pos-list = s.pos-list. All in all we have u'.queue = u.queue. Leaving queue
unchanged is as required by the definitions of drop(∅) and unmark(∅) in A_D.

status: 
State s cannot satisfy conditions A, C(i), and C(vii) of Part 3 because s.mode_s = send.
If s satisfies condition B then so does s'. By defining u''.status = u.status we have that
status is unchanged in the execution fragment, which is allowed by the definitions of
drop(∅) and unmark(∅) in A_D.
State s cannot satisfy condition C(ii) since s.last_s ∉ s.good-ids as explained above.
Also, s cannot satisfy condition C(iii). If b = true then Invariant 8.9 Part 3 implies
s.buf_r = ε, which contradicts condition C(iii). If b = false then Invariant 8.11 Part 2
implies s.last_r ≠ s.last_s, which is also a contradiction.
Assume s satisfies condition C(iv). Then u.status = (true, OK). From the discussion of
the previous condition C(iii), we have b = true. Now, s'.current-ack_s = b = true and
s'.mode_s = idle, so s' satisfies condition C(vii). Thus, also u'.status = (true, OK). By
defining u''.status = (true, OK) we have that status is unchanged in the execution fragment,
which is allowed by the definitions of drop(∅) and unmark(∅) in A_D.
Next, assume s satisfies condition C(v). Then u.status = (true, marked). If b = true
then by condition C(vii) we have u'.status = (true, OK). This is allowed by drop(∅)
and unmark(∅) by defining u''.status = u.status. If b = false then, again, by condition
C(vii) u'.status = (false, OK), which is allowed by drop(∅) and unmark(∅) by defining
u''.status = u'.status.
Finally, if s satisfies C(vi) then b must be false since the condition states (s.last_s, true) ∉
s.rs. Thus, u.status = (false, OK) and by condition C(vii) also u'.status = (false, OK).
So, by defining u''.status = u.status, we leave status unchanged, which is allowed by the
definition of drop(∅) and unmark(∅) in A_D.


• s.mode_s ≠ send ∨ s.last_s ≠ id


Then the only difference between s' and s is that s' has one fewer copy of (id, b) in the
channel rs.

We show that u' = u. Then the execution fragment u clearly has the right properties. We
check the state variables of A_D.

rec_s, rec_r, and queue:
Obviously rec_s, rec_r, and queue are the same in u and u'.

status:
No matter which condition in Part 3 s satisfies, s' satisfies the same condition; thus,
u'.status = u.status. The only interesting case is if s satisfies condition C(v). The
condition states that s.mode_s = send, so the case hypothesis gives us that id ≠ s.last_s.
Thus, (id, b) ≠ (s.last_s, true). Then, since (s.last_s, true) ∈ s.rs by condition C(v) we
also have (s'.last_s, true) ∈ s'.rs. Thus, s' also satisfies condition C(v).


a ∈ {shrink_good_s(ids), grow_good_s(ids)}


Changing good_s clearly does not change anything in the mapping R_GD. Thus, u' = u. Then the
finite execution fragment u clearly has the right properties.


a = shrink_good_r(ids)


This step removes elements from good_r; thus, s'.good-ids ⊆ s.good-ids.


We consider cases.


• s.current-ok = false


We show (u, drop(I), u') ∈ steps(A_D), where I is defined below. Clearly the step (and finite
execution fragment) has the right trace (the empty trace).

rec_s and rec_r:
We clearly have u'.rec_s = u.rec_s and u'.rec_r = u.rec_r as required by the definition of
drop(I) in A_D.

queue: 
By shrinking good-ids we might remove elements from pos-list and current-queue. But
the elements in u.queue corresponding to these elements are all marked (for current-queue
remember that s.current-ok = false implies s.current-flag = marked), so by defining I to
be the indices of these elements we get both that drop(I) is enabled in u and that queue
is handled correctly.

status: 
Assume s satisfies condition A, B, or C(i) in Part 3. Then so does s', so u'.status =
u.status. This is allowed by drop(I) since in the cases (B and C(i)) where status ≠
(false, OK) the element at the end of u.queue is not deleted.
If s satisfies C(ii) then either s' also satisfies C(ii), which is allowed since the element at
the end of u.queue (which corresponds to current-queue) is not deleted, or s' satisfies C(vi)
(it cannot satisfy C(iii)-C(v) because of Invariant 8.4 Part 7 and Invariant 8.10 Part 2),
which is allowed by drop(I) since s.current-ok = false implies u.status.flag = marked.

If s satisfies C(iii)-C(v), then so does s', so u'.status = u.status. But this is allowed by
drop(I) since in these cases we have u'.queue = u.queue.

If s satisfies C(vi) then so does s'. In this situation the element at the end of u.queue
might have been deleted (corresponding to elements being removed from pos-list), but
since status = (false, OK), status is handled correctly.

Finally, if s satisfies C(vii) then so does s'. If current-ack_s = false then u'.status =
u.status = (false, OK), which is allowed by drop(I). If current-ack_s = true then
Invariant 8.8 Part 4 implies that u'.queue = u.queue. Thus the element at the end of u.queue
is not deleted, so it is permitted to leave status unchanged at (true, OK).


Thus, (u, drop(I), u') ∈ steps(A_D).


• s.current-ok = true


Again we claim that (u, drop(I), u') ∈ steps(A_D).


The argument is similar to the previous case except that since current-ok = true, we have
current-flag = OK, so it is not allowed to lose an element in current-queue or lose status in
case C(ii). However, the precondition to shrink_good_r(ids) ensures that these requirements
are met.


a = grow_good_r(ids)


The precondition ids ∩ issued_r = ∅ and the effect of grow_good_r(ids) ensure that s'.good-ids =
s.good-ids.


Then it is easy to see that u' = u. Thus, the execution fragment u has the right properties.


a = cleanup_r


We show that u' = u. Then the execution fragment u has the right properties. We consider the
four state variables of A_D.


rec_s, rec_r, and queue:
We obviously have u'.rec_s = u.rec_s, u'.rec_r = u.rec_r, and u'.queue = u.queue.


status:
Here the only problem would be that last_r is changed. The variable last_r only occurs in the
conditions of Part 3 when mode_s = send, so assume s.mode_s = send. Then s.last_r ≠ s.last_s
from the precondition. Since also s'.mode_s = send, Invariant 8.1 Part 2 gives us s'.last_s ≠ nil.
Now, since s'.last_r = nil, we also have s'.last_r ≠ s'.last_s. It is now easy to see that whatever
condition in Part 3 that s satisfies, s' satisfies the same condition. Thus, u'.status = u.status.


This concludes the simulation proof. 


We can now prove that A_G safely implements A_D.


Theorem 8.15 (A_G safely implements A_D)

A_G ⊑_S A_D


Proof

Directly by Lemma 8.14 and the soundness of refinement mappings with respect to the safe
implementation relation (Lemma 5.8).


8.5.3 Correctness


We do not give a formal proof that G correctly implements D. Instead we provide some intuitive 
justification and refer to the formal proof that H correctly implements G which is similar. 

We first give two key lemmas about the live executions of G. We use our temporal logic to
state the results but we only give informal proofs. These lemmas are then used to prove that G
correctly implements D.

The first lemma says that if we are in a situation where no crashes occur in the future, then
whenever mode_s = send, eventually the sender will move to idle mode. Note that, due to
previous crashes, the sender and the receiver do not necessarily agree on what identifiers to use.
So, in some situations, the sender moves to idle mode because of negative acknowledgements
from the receiver, in which case the current message might have been lost.


Lemma 8.16

L_G ⊨ □(□(mode_s ≠ rec ∧ mode_r ≠ rec) ⇒ (mode_s = send ⇝ mode_s = idle))


Proof

ASSUME: 1. α ∈ L_G
        2. α_1 is an arbitrary suffix of α
        3. α_1 ⊨ □(mode_s ≠ rec ∧ mode_r ≠ rec)
        4. α_2 is an arbitrary suffix of α_1
        5. α_2 ⊨ mode_s = send
PROVE:  α_2 ⊨ ◇(mode_s = idle)


We consider what happens in α_2. Note that since mode_s = send and no crashes occur, mode_s will
stay send unless one of the actions receive_pkt_rs(last_s, true) or receive_pkt_rs(last_s, false) occurs,
in which case mode_s changes to idle. Furthermore, while mode_s = send, last_s is unchanged
and the sender keeps performing send_pkt_sr(m, last_s). The latter is due to weak fairness to the
set C_{G,s/r,1} containing send_pkt_sr(m, last_s), since all other actions in the set are never enabled.
Now, it suffices to show that eventually receive_pkt_rs(last_s, true) or receive_pkt_rs(last_s, false)
occurs.


(1)1. CASE: α_2 ⊨ last_s ∉ good-ids
(2)1. CASE: α_2 ⊨ (last_s, true) ∈ rs

PROOF: By the fairness of the rs channel, eventually a receive_pkt_rs(last_s, true)
action occurs. That suffices.

(2)2. CASE: α_2 ⊨ (last_s, true) ∉ rs ∧ last_r = last_s

PROOF: In this situation the receiver has received the current packet but not yet
sent positive acknowledgements.

If buf_r ≠ ε, weak fairness to the set C_{G,s/r,3} implies that eventually buf_r = ε.
Furthermore, buf_r stays empty as long as the sender does not leave send mode.
Now, when buf_r = ε, we have mode_r ∈ {idle, ack}. If mode_r = idle, it changes
to ack when a receive_pkt_sr(m, last_s) occurs. Since the sender keeps on sending
(m, last_s) packets, some will continue to get through (by channel liveness), so if
mode_r = idle, eventually mode_r = ack. When mode_r = ack the receiver will
continue to perform send_pkt_rs(last_r, true). Such a step can, however, change mode_r
to idle, but from above we have that eventually mode_r = ack again and new
send_pkt_rs(last_r, true) steps will be performed.

By channel liveness, eventually receive_pkt_rs(last_r, true) occurs, and since last_r =
last_s, the result follows.


(2)3. CASE: α_2 ⊨ (last_s, true) ∉ rs ∧ last_r ≠ last_s

PROOF: This case actually describes two situations: in the first situation the current
packet never has been and never can be successfully received by the receiver. In the
second situation the current packet has been successfully received but the receiver
crashed before placing a positive acknowledgement in the channel. Both situations
are dealt with in the same way.

Every time a receive_pkt_sr(m, last_s) step occurs, last_s is placed into nack-buf_r, which
leads to a send_pkt_rs(last_s, false) action (by fairness to the send_pkt_rs(id, false)
actions). Since receive_pkt_sr(m, last_s) continues to occur, send_pkt_rs(last_s, false)
continues to occur. By channel liveness eventually receive_pkt_rs(last_s, false) occurs.
That suffices.


(2)4. Q.E.D.
PROOF: By the exhaustive cases (2)1-(2)3.
(1)2. CASE: α_2 ⊨ last_s ∈ good-ids

PROOF: Then either always last_s ∉ good_r or eventually last_s ∈ good_r.

If always last_s ∉ good_r then the situation is as described by the case above where α_2 ⊨
last_s ∉ good-ids ∧ (last_s, true) ∉ rs ∧ last_r ≠ last_s.

If eventually last_s ∈ good_r, then still the receiver might have issued send_pkt_rs(last_s, false)
actions in the meantime, and these packets could have gotten through to the sender, in
which case the result follows. So, if this is not the case, eventually (m, last_s) is successfully
received, in which case the situation is as described by the case above where α_2 ⊨ last_s ∉
good-ids ∧ (last_s, true) ∉ rs ∧ last_r = last_s.

(1)3. Q.E.D.
PROOF: By the exhaustive cases (1)1-(1)2.


The result now follows from Lemma 3.5 and the definition of ⇝.

∎

The next lemma states that if there are elements in the four parts that make up the abstraction
of a queue in A_D (cf. Definition 8.13), then eventually a receive_msg(m) action occurs. Thus,
messages cannot be blocked in the G protocol.
Below we use the notation receive_msg(_) to denote the set {receive_msg(m) | m ∈ Msg}.


Lemma 8.17

L_G ⊨ □(□(mode_s ≠ rec ∧ mode_r ≠ rec ∧
(buf_r ≠ ε ∨ pos-list ≠ ε ∨ current-queue ≠ ε ∨ buf_s ≠ ε)) ⇒ ◇(receive_msg(_)))


Proof

ASSUME: 1. α ∈ L_G
        2. α_1 is an arbitrary suffix of α
        3. α_1 ⊨ □(mode_s ≠ rec ∧ mode_r ≠ rec ∧
           (buf_r ≠ ε ∨ pos-list ≠ ε ∨ current-queue ≠ ε ∨ buf_s ≠ ε))
PROVE:  α_1 ⊨ ◇(receive_msg(_))


(1)1. CASE: α_1 ⊨ buf_r ≠ ε

PROOF: The result follows by weak fairness to the set C_{G,s/r,3}.

(1)2. CASE: α_1 ⊨ pos-list ≠ ε

PROOF: The packets in pos-list represent “old” packets in the sr channel that might still
successfully be received by the receiver since the packets all have identifiers in good-ids.
Due to channel liveness (the weak fairness requirement on each packet), the packets in
pos-list will eventually be received. Two situations can occur.

First, a packet from pos-list is accepted because it has an identifier in good_r at the time
it is received. In this case the message of the packet is placed in buf_r, and (1)1 gives the
result.

Second, no packets from pos-list are ever accepted. Then eventually pos-list becomes
empty (no new packets can be added to pos-list since no crashes occur, and each packet in
pos-list has only finitely many copies in sr and these will eventually all be received (but
not accepted) and thus removed from sr). However, then one of the other disjuncts in
Part 3 of the Assumption must be satisfied, so we refer to the other cases.


(1)3. CASE: α_1 ⊨ current-queue ≠ ε

(2)1. CASE: α_1 ⊨ current-ok = true

PROOF: In this situation the sender either will choose (because of liveness on choose_id(id)
actions) or has chosen a current identifier last_s which is in good_r (and stays there until
the current packet is accepted). The sender will send the current packet repeatedly,
so by channel liveness it will eventually be received and thus accepted. The message
will be placed into buf_r, and Case (1)1 gives the result.

(2)2. CASE: α_1 ⊨ current-ok = false

PROOF: Here, due to the fact that the receiver was crashed during the last prepare
action, the sender may choose an identifier which is not in good_r. The sender will
send the current packet repeatedly, and two things can happen.

Either the current packet will be accepted at some point by the receiver because
last_s was in good-ids initially or has been added to good_r in the meantime. Then
the message is placed in buf_r, and Case (1)1 gives the result.

Or the current packet will never be accepted by the receiver. However, since the
current packet will keep on being received by the receiver (due to channel liveness),
the receiver will keep on issuing negative acknowledgements for the current
identifier last_s. By channel liveness such a negative acknowledgement will eventually
get through and move the sender to idle mode. This has the effect of emptying
current-queue, so one of the other disjuncts in Part 3 of the Assumption must be
satisfied, and we refer to the other cases.

(2)3. Q.E.D.
PROOF: By exhaustive cases (2)1 and (2)2.
(1)4. CASE: α_1 ⊨ buf_s ≠ ε

PROOF: By fairness to the set C_{G,s/r,1}, eventually a prepare action will occur. Since
mode_r ≠ rec, the sender ends up in needid mode with current-ok = true. The result is
now implied by the first subcase of Case (1)3.


(1)5. Q.E.D.
PROOF: By exhaustive cases (1)1-(1)4.


The result now follows from Lemma 3.5.


With the two lemmas above we can prove the main ingredient in our liveness proofs, namely, if
α is a live execution of G and α' is an execution of A_D such that (α, α') ∈ R_GD, then α' is live.
We prove the result by contradiction (cf. the similar lemma (Lemma 7.17) in the proof that D
correctly implements S). Thus, we assume that α' is not live and then derive a contradiction
with the fact that α is live.


Lemma 8.18

Let α ∈ exec(A_G) and α' ∈ exec(A_D) be arbitrary executions of A_G and A_D, respectively, with
(α, α') ∈ R_GD. Assume α ⊨ Q_G. Then α' ⊨ Q_D.


Proof

We prove the conjecture by contradiction. Thus,

ASSUME: α' ⊭ Q_D
PROVE:  False


(1)1. α' ⊨ ¬WF(C_{D,1}, rec_s = false ∧ rec_r = false) ∨
¬WF(C_{D,2}, rec_s = false ∧ rec_r = false) ∨
¬WF(C_{D,3}) ∨
¬WF(C_{D,4})

PROOF: Immediate by the Assumption, definition of Q_D, and the Boolean operators.
(1)2. CASE: α' ⊨ ¬WF(C_{D,1}, rec_s = false ∧ rec_r = false)

(2)1. α' ⊨ ◇□(status.stat ∈ Bool ∧ rec_s = false ∧ rec_r = false) ∧
◇□¬({ack(true), ack(false)})

PROOF: By Assumption (1), the definitions of WF and C_{D,1}, and the fact that ack(b)
actions are enabled when status.stat ∈ Bool.


(2)2. α ⊨ ◇□(mode_s ≠ rec ∧ mode_r ≠ rec ∧ buf_s = ε ∧
((mode_s = send ∧ last_r = last_s ∧ buf_r = ε) ∨
(mode_s = send ∧ last_r ≠ last_s ∧ (last_s, true) ∈ rs) ∨
(mode_s = send ∧ last_r ≠ last_s ∧ (last_s, true) ∉ rs ∧ last_s ∉ good-ids) ∨
(mode_s = idle))) ∧
◇□¬({ack(true), ack(false)})

PROOF: By (2)1, Lemmas 5.10 and 5.11, the definition of R_GD, and the fact that
ack(b) actions are external.

(2)3. α ⊨ ◇□(mode_s = idle ∧ buf_s = ε) ∧ ◇□¬({ack(true), ack(false)})

PROOF: By (2)2, Lemma 8.16, and the fact that when mode_s becomes idle, it stays
idle since no crashes occur and no prepare action can occur (since buf_s = ε forever).

(2)4. α ⊨ ◇□(mode_s = idle ∧ buf_s = ε) ∧ ◇□¬(C_{G,s/r,1})

PROOF: By (2)3 since the ack(b) actions are in C_{G,s/r,1} and no other actions in C_{G,s/r,1}
can occur when mode_s = idle and buf_s = ε.

(2)5. α ⊨ ¬WF(C_{G,s/r,1})

PROOF: By (2)4, the definition of WF, and the fact that mode_s = idle ∧ buf_s = ε
implies the enabling condition of C_{G,s/r,1}.

(2)6. Q.E.D.

PROOF: (2)5 contradicts the assumption that α is live.


(1)3. CASE: α' ⊨ ¬WF(C_{D,2}, rec_s = false ∧ rec_r = false)

(2)1. α' ⊨ ◇□(queue ≠ ε ∧ rec_s = false ∧ rec_r = false) ∧ ◇□¬(receive_msg(_))

PROOF: By Assumption (1), the definitions of WF and C_{D,2}, and the fact that C_{D,2}
is enabled when queue ≠ ε.

(2)2. α ⊨ ◇□(mode_s ≠ rec ∧ mode_r ≠ rec ∧
(buf_r ≠ ε ∨ pos-list ≠ ε ∨ current-queue ≠ ε ∨ buf_s ≠ ε)) ∧
◇□¬(receive_msg(_))

PROOF: By (2)1, Lemmas 5.10 and 5.11, the definition of R_GD, and the fact that
receive_msg(m) actions are external.

(2)3. Q.E.D.

PROOF: (2)2 contradicts Lemma 8.17.


(1)4. CASE: α' ⊨ ¬WF(C_{D,3})

(2)1. α' ⊨ ◇□(rec_s = true) ∧ ◇□¬(recover_s)

PROOF: By expanding WF in Assumption (1).

(2)2. α ⊨ ◇□(mode_s = rec) ∧ ◇□¬(recover_s)

PROOF: By (2)1, Lemmas 5.10 and 5.11, the definition of R_GD, and the fact that
recover_s is external.

(2)3. α ⊨ ◇□(mode_s = rec) ∧ ◇□¬(C_{G,s/r,1})

PROOF: From (2)2 since recover_s ∈ C_{G,s/r,1} and none of the other actions in C_{G,s/r,1}
are enabled when mode_s = rec.

(2)4. α ⊨ ¬WF(C_{G,s/r,1})

PROOF: From (2)3, the definition of WF and the fact that mode_s = rec implies the
enabling condition for C_{G,s/r,1}.

(2)5. Q.E.D.

PROOF: (2)4 contradicts the assumption that α is live.

(1)5. CASE: α' ⊨ ¬WF(C_{D,4})

PROOF: Similar to (1)4.

(1)6. Q.E.D.
PROOF: By (1)1 and the exhaustive cases (1)2-(1)5.

∎


Finally, we can show that G correctly implements D. 


Theorem 8.19

G ⊑_L D


Proof
Immediate by Lemmas 8.14, 8.18, and 5.9.

∎


We are now ready to consider the two low-level protocols: the Five-Packet Handshake Protocol 
H and the Clock-Based Protocol C. The next chapter deals with H and then, in Chapter 10, we 
consider C. 


Chapter 9


The Five-Packet Handshake Protocol H


We have now reached the point where we can present the first of the low-level protocols we 
consider, namely, the Five-Packet Handshake Protocol of Belsnes [Bel76], which in this work is 
denoted by H. The H protocol is entirely distributed: it consists of a sender process, a receiver 
process, and two channels as depicted in Figure 9.1. 

H is the standard protocol for setting up network connections, used in TCP, ISO TP-4, 
and many other transport protocols. During normal operation it goes through three phases (cf. 
Figure 9.2): 


Agree on identifier: The sender picks an identifier, called jd to distinguish it from the identi- 
fier id used below for the actual communication of the message, and sends it in a needid 
packet. On receipt of this packet, the receiver pairs jd with a new identifier id, and sends 
the pair (jd, id) back to the sender. On receipt of this pair, the sender knows that it should 
associate id to the current message. 


Send and acknowledge: This phase is similar to the send/acknowledge phase of G. The 
sender sends the current packet in send packets, and the receiver acknowledges the receipt 
with ack packets. 


Clean up: When the sender has received the acknowledgement, it issues a done packet in order 
to inform the receiver that it may forget about the last message accepted. 
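
The three phases above can be sketched as a failure-free packet exchange. This is only an illustration under assumptions: the packet type names needid, accept, send, ack and done follow the text, but the tuple layout of each packet, the function name five_packet_handshake, and the example identifiers "jd0" and "id0" are hypothetical. Crashes, retransmissions and packet loss are not modelled.

```python
from typing import List, Tuple


def five_packet_handshake(msg: str, jd: str = "jd0", id_: str = "id0"):
    """Return the packet trace and delivered messages for one failure-free run."""
    trace: List[Tuple[str, tuple]] = []
    delivered: List[str] = []

    # Phase 1: agree on identifier.
    trace.append(("s->r", ("needid", jd)))        # sender proposes its own jd
    trace.append(("r->s", ("accept", jd, id_)))   # receiver pairs jd with a new id

    # Phase 2: send and acknowledge.
    trace.append(("s->r", ("send", msg, id_)))    # current message tagged with id
    delivered.append(msg)                          # receiver accepts the message
    trace.append(("r->s", ("ack", id_, True)))    # positive acknowledgement

    # Phase 3: clean up.
    trace.append(("s->r", ("done", id_)))         # receiver may forget the message
    return trace, delivered


if __name__ == "__main__":
    for direction, packet in five_packet_handshake("hello")[0]:
        print(direction, packet)
```

The sketch makes the packet count explicit: five packets are exchanged in total, of which the first three already suffice for the message to reach the receiver.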


[Figure 9.1: The Five-Packet Handshake Protocol H. The Sender and the Receiver are connected by Channel Ch_sr and Channel Ch_rs. Labelled actions: send_msg(m), receive_msg(m), send_pkt_sr(p), receive_pkt_sr(p), send_pkt_rs(p), receive_pkt_rs(p), recover_s, recover_r.]
[Figure 9.2: The phases of H. Packets exchanged between Sender and Receiver, starting with needid, grouped into three phases: Agree on identifier, Send and acknowledge, Clean up.]


Below we look at different abnormal situations which can arise due to crashes. H is sometimes 
called the three-way handshake, because only three packet types are needed for message delivery 
(the first three in Figure 9.2). 


The rest of this chapter is organized as follows. Section 9.1 considers the channels in H. Then,
in Section 9.2, we present the sender and receiver processes, and in Section 9.3 we show how H
is obtained from the subprocesses. Finally, in Section 9.4 we prove that H correctly implements
G.


9.1 The Channels 


We use the same channels as at the G level (cf. Section 8.2). However, the actual packets that 
are communicated are different in H and G. This only means that in H we should instantiate 
the set P of possible packets with a different set of packets than in G. 


9.2 The Sender and the Receiver 


In this section we specify the sender and receiver processes as two live I/O automata H_s =
(A_{H,s}, L_{H,s}) and H_r = (A_{H,r}, L_{H,r}), respectively. In the subsection defining steps(A_{H,s}) and
steps(A_{H,r}) below, we provide more intuition about the H protocol.


9.2.1 States and Start States 


The sender and receiver processes both contain a stable set of used identifiers. This means that 
these sets should survive crashes when implemented on a physical machine. Specifically, we 
model the stability of a state variable by not resetting it on recovery. 

For instance, the stable set issued_r includes all identifiers ever considered “good” by the
receiver. Thus, every time the receiver issues a new identifier id (to be sent to the sender in an
accept packet) this should be remembered forever by adding id to issued_r. This is an expensive
solution since it requires an update to a stable variable for every message. The fix to this problem
would be to introduce a normal volatile (i.e., non-stable) variable unused_r, which is filled with
new (i.e., not in issued_r) identifiers now and then, in steps that also update the stable variable
issued_r by adding these new identifiers. Then, for each message, the identifier can be chosen from
unused_r and no updates to stable variables need to be performed. Of course, unused_r will be
lost in crashes, so it should not be kept too big; on the other hand, the fewer identifiers it
contains, the more frequently updates to the stable variable issued_r need to be performed.
This is a typical trade-off.

We do not consider the addition of the variable unused_r to H_r, but the changes needed are
both few and simple.
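
The trade-off described above can be illustrated with a small sketch. It is not part of H_r: the
class name IdAllocator, the batch_size parameter, and the use of natural numbers as identifiers
(so that the stable set issued_r can be represented by a single counter) are all assumptions made
for illustration only.

```python
class IdAllocator:
    """Hands out fresh identifiers; only the stable counter survives a crash."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.issued_upto = 0    # stable: every id < issued_upto counts as issued
        self.unused = []        # volatile: issued but not yet handed out
        self.stable_writes = 0  # number of updates to the stable variable

    def _refill(self):
        # One stable update reserves a whole batch of identifiers.
        start = self.issued_upto
        self.issued_upto += self.batch_size
        self.stable_writes += 1
        self.unused = list(range(start, self.issued_upto))

    def fresh_id(self):
        if not self.unused:
            self._refill()
        return self.unused.pop(0)

    def crash_and_recover(self):
        # The volatile pool is lost; the stable counter is not reset.
        self.unused = []


alloc = IdAllocator(batch_size=4)
ids = [alloc.fresh_id() for _ in range(5)]   # needs two stable writes
alloc.crash_and_recover()                    # ids 5, 6, 7 are wasted
after_crash = alloc.fresh_id()               # still fresh: never handed out before
```

A larger batch_size means fewer stable writes per message but more identifiers wasted per crash.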


Sender 


The sender chooses identifiers jd from the set JD. This set is similar to the set ID introduced in
Section 8.1. We call it JD to distinguish it from ID, which are identifiers chosen by the receiver.


S  Variable       Type                       Initially  Description

   mode_s         {idle, needid, send, rec}  idle       The mode of the sender. Similar to the
                                                        mode of the sender at the G level.
   buf_s          Msg*                       ε          The list of messages at the sender side.
   jd_s           JD ∪ {nil}                 nil        The jd chosen for the current message
                                                        by the sender.
S  jd-used_s      P(JD)                      […]        A set including all the jds ever used by
                                                        the sender.
   id_s           ID ∪ {nil}                 nil        The id received from the receiver. […]
   current-msg_s  Msg ∪ {nil}                nil        The message about to be sent to the
                                                        receiver. Same as at the G level.
   current-ack_s  Bool                       false      Acknowledgement from the receiver. […]
   done-buf_s     ID*                        ε          A list of ids for which the sender must
                                                        issue done packets.

S = Stable


Receiver


S  Variable       Type                       Initially  Description

   mode_r         {idle, accept, rcvd, ack,  idle       The mode of the receiver. Similar to
                  rec}                                  the receiver mode at the G level, except
                                                        for the extra accept mode. In mode
                                                        accept the receiver is sending accept
                                                        packets, which contain the chosen
                                                        message identifier.
   buf_r          Msg*                       ε          The list of messages accepted. Same as
                                                        at the G level.
   jd_r           JD ∪ {nil}                 nil        The jd received from the sender.
   id_r           ID ∪ {nil}                 nil        The id chosen for the received jd.
   last_r         ID ∪ {nil}                 nil        This variable contains (when non-nil)
                                                        the id of the last packet accepted.
S  issued_r       P(ID)                      […]        A set including all ids ever issued by
                                                        the receiver. Same as at the G level.
   nack-buf_r     ID*                        ε          A list of ids for which the receiver
                                                        will issue negative acknowledgements.
                                                        Same as at the G level.

S = Stable


9.2.2 Actions 


Sender 


Input:
    send_msg(m), m ∈ Msg
    crash_s
    receive_pkt_rs(accept, jd, id), jd ∈ JD, id ∈ ID
    receive_pkt_rs(ack, id, b), id ∈ ID, b ∈ Bool
Output:
    ack(b), b ∈ Bool
    recover_s
    send_pkt_sr(needid, jd), jd ∈ JD
    send_pkt_sr(send, m, id), m ∈ Msg, id ∈ ID
    send_pkt_sr(done, id), id ∈ ID
Internal:
    choose_jd(jd), jd ∈ JD
    grow-jd-used_s(jds), jds ∈ P(JD)


Receiver


Input:
    crash_r
    receive_pkt_sr(needid, jd), jd ∈ JD
    receive_pkt_sr(send, m, id), m ∈ Msg, id ∈ ID
    receive_pkt_sr(done, id), id ∈ ID
Output:
    receive_msg(m), m ∈ Msg
    recover_r
    send_pkt_rs(accept, jd, id), jd ∈ JD, id ∈ ID
    send_pkt_rs(ack, id, b), id ∈ ID, b ∈ Bool
Internal:
    grow-issued_r(ids), ids ∈ P(ID)


9.2.3 Steps 


We now formally define steps(A_{H,s}) and steps(A_{H,r}). As at the G level we increase readability
by listing the definition of steps(A_{H,s}) in the left column and the definition of steps(A_{H,r})
in the right, and by aligning send_pkt with the corresponding receive_pkt.

After the definition, we provide more intuition about how H works. 


send_msg(m)
  Effect:
    if mode_s ≠ rec then
      buf_s := buf_s ^ m

choose_jd(jd)
  Precondition:
    mode_s = idle ∧ buf_s ≠ ε ∧
    jd ∉ jd-used_s
  Effect:
    mode_s := needid
    jd_s := jd
    jd-used_s := jd-used_s ∪ {jd}
    current-msg_s := head(buf_s)
    buf_s := tail(buf_s)

send_pkt_sr(needid, jd)                       receive_pkt_sr(needid, jd)
  Precondition:                                 Effect:
    mode_s = needid ∧ jd_s = jd                   if mode_r = idle then
  Effect:                                           mode_r := accept
    none                                            choose an id not in issued_r
                                                    jd_r := jd
                                                    id_r := id
                                                    issued_r := issued_r ∪ {id}

receive_pkt_rs(accept, jd, id)                send_pkt_rs(accept, jd, id)
  Effect:                                       Precondition:
    if mode_s ≠ rec then                          mode_r = accept ∧ jd_r = jd ∧ id_r = id
      if mode_s = needid ∧ jd_s = jd then       Effect:
        mode_s := send                            none
        id_s := id
      else if id_s ≠ id then
        done-buf_s := done-buf_s ^ id

send_pkt_sr(send, m, id)                      receive_pkt_sr(send, m, id)
  Precondition:                                 Effect:
    mode_s = send ∧ current-msg_s = m ∧           if mode_r ≠ rec then
    id_s = id                                       if mode_r = accept ∧ id_r = id then
  Effect:                                             mode_r := rcvd
    none                                              buf_r := buf_r ^ m
                                                      last_r := id
                                                    else if last_r ≠ id then
                                                      nack-buf_r := nack-buf_r ^ id

                                              receive_msg(m)
                                                Precondition:
                                                  mode_r = rcvd ∧ buf_r ≠ ε ∧
                                                  head(buf_r) = m
                                                Effect:
                                                  buf_r := tail(buf_r)
                                                  if buf_r = ε then
                                                    mode_r := ack

receive_pkt_rs(ack, id, b)                    send_pkt_rs(ack, id, true)
  Effect:                                       Precondition:
    if mode_s ≠ rec then                          mode_r = ack ∧ last_r = id
      if mode_s = send ∧ id_s = id then         Effect:
        mode_s := idle                            none
        current-ack_s := b
        jd_s := nil                           send_pkt_rs(ack, id, false)
        id_s := nil                             Precondition:
        current-msg_s := nil                      mode_r ≠ rec ∧ nack-buf_r ≠ ε ∧
      if b = true then                            head(nack-buf_r) = id
        done-buf_s := done-buf_s ^ id           Effect:
                                                  nack-buf_r := tail(nack-buf_r)

send_pkt_sr(done, id)                         receive_pkt_sr(done, id)
  Precondition:                                 Effect:
    mode_s ≠ rec ∧ done-buf_s ≠ ε ∧               if (mode_r = accept ∧ id_r = id) ∨
    head(done-buf_s) = id                            (mode_r = ack ∧ last_r = id) then
  Effect:                                           mode_r := idle
    done-buf_s := tail(done-buf_s)                  jd_r := nil
                                                    id_r := nil
                                                    last_r := nil

ack(b)
  Precondition:
    mode_s = idle ∧ buf_s = ε ∧
    current-ack_s = b
  Effect:
    none

crash_s                                       crash_r
  Effect:                                       Effect:
    mode_s := rec                                 mode_r := rec

recover_s                                     recover_r
  Precondition:                                 Precondition:
    mode_s = rec                                  mode_r = rec
  Effect:                                       Effect:
    mode_s := idle                                mode_r := idle
    jd_s := nil                                   jd_r := nil
    id_s := nil                                   id_r := nil
    buf_s := ε                                    last_r := nil
    current-msg_s := nil                          buf_r := ε
    current-ack_s := false                        nack-buf_r := ε
    done-buf_s := ε

grow-jd-used_s(jds)                           grow-issued_r(ids)
  Precondition:                                 Precondition:
    |JD \ (jd-used_s ∪ jds)| = ∞                  |ID \ (issued_r ∪ ids)| = ∞
  Effect:                                       Effect:
    jd-used_s := jd-used_s ∪ jds                  issued_r := issued_r ∪ ids


The following note about the receive_pkt_sr(needid, jd) steps should be made: A_{H,r} is required
to be input-enabled, and therefore we do not specify preconditions for input actions. However,
in the effect clause of receive_pkt_sr(needid, jd) we must choose an id not in issued_r, which
is only possible if issued_r ≠ ID. Part 1 of Invariant 9.11 below states that this is indeed the
case in all reachable states. Since there exist (non-reachable) states with issued_r = ID,
A_{H,r} as literally defined is not input-enabled. This is not a problem in practice, but to make
A_{H,r} input-enabled we interpret the definition of receive_pkt_sr(needid, jd) such that an
arbitrary id is chosen if issued_r = ID.


We first describe the normal mode of operation: the sender performs a choose_jd(jd) action
(which corresponds to prepare of G) and moves to mode needid, where it repeatedly sends
(needid, jd) to the receiver. By channel liveness these packets will continue to get through.
One of the major problems in the liveness proof below is to show that eventually the receiver
will be in idle mode. When this happens, the receiver accepts (needid, jd), associates a new
identifier id with jd, and moves to accept mode, where it repeatedly issues (accept, jd, id)
packets. Again by channel liveness, such a packet gets through, and since jd is equal to the
current jd (kept in jd_s) of the sender, the sender accepts this packet. The value jd is no longer
needed, but id is used for the actual communication.

On receipt of (accept, jd, id) the sender moves to mode send. Note how the accept packets
work as acknowledgements for the needid packets. In send mode the sender repeatedly sends
the current packet (send, m, id). When one gets through, it is accepted since the id in the packet
corresponds to the current id (kept in id_r) of the receiver. The message m is placed in buf_r
and the identifier id for which the receiver shall eventually issue positive acknowledgements is
remembered in the last_r variable. (Note the difference between id_r and last_r: id_r remembers
the identifier that the receiver will accept, whereas last_r remembers the identifier for which the
receiver must issue positive acknowledgements. Due to this difference the identifiers are kept in
separate variables.) Now, eventually m is delivered to the user and the receiver moves to ack
mode. Note how the send packets work as acknowledgements for the accept packets.

In ack mode the receiver repeatedly sends positive acknowledgements in (ack, id, true) pack- 
ets. When one gets through, the sender leaves send mode and issues a positive acknowledgement 
ack(b) to the user at the sender side. 

The receiver has no knowledge of whether an (ack, id, true) packet has gotten through yet
or not, so it continues to issue the packets. Somehow the receiver must be informed that the
sender has received the acknowledgement. The done packets are used for this purpose. It
would not work if the sender entered a mode where it repeatedly issued done packets because
then the receiver would have to acknowledge the receipt of a done packet, and so on. Instead,
every time the sender receives (ack, id, true) it adds id to done-buf_s, and this leads to one
send_pkt_sr(done, id) being issued. There is no guarantee that the packet is not lost, but if it
is, the sender will eventually receive another (ack, id, true) packet, which gives rise to another
send_pkt_sr(done, id) step. This cannot go on forever because of channel liveness, so eventually
the receiver will receive (done, id) and since id is equal to last_r, the receiver leaves ack mode
and moves to idle mode, where it is allowed to forget everything about jd_r, id_r, and last_r.
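
The normal mode of operation described above can be traced with a small executable sketch. It is
a simplification, not the automata themselves: the channels are assumed to be lossless and
synchronous, crashes (and the rec mode) are omitted, the ack(b) output to the user is not modelled,
counters stand in for the choice of fresh jds and ids, and a fixed scheduling order is assumed. The
names Sender, Receiver, and run are illustrative only.

```python
from itertools import count


class Sender:
    def __init__(self):
        self.mode, self.buf, self.done_buf = "idle", [], []
        self.jd = self.id = self.msg = None
        self.ack = False
        self.fresh_jd = count()          # stands in for choosing jd not in jd-used_s

    def choose_jd(self):                 # precondition: idle and buf non-empty
        self.mode, self.jd = "needid", next(self.fresh_jd)
        self.msg = self.buf.pop(0)

    def output(self):                    # one locally controlled send_pkt_sr step, if any
        if self.done_buf:
            return ("done", self.done_buf.pop(0))
        if self.mode == "needid":
            return ("needid", self.jd)
        if self.mode == "send":
            return ("send", self.msg, self.id)
        return None

    def receive(self, pkt):
        if pkt[0] == "accept":
            _, jd, id_ = pkt
            if self.mode == "needid" and self.jd == jd:
                self.mode, self.id = "send", id_
            elif self.id != id_:
                self.done_buf.append(id_)
        elif pkt[0] == "ack":
            _, id_, b = pkt
            if self.mode == "send" and self.id == id_:
                self.mode, self.ack = "idle", b
                self.jd = self.id = self.msg = None
            if b:
                self.done_buf.append(id_)


class Receiver:
    def __init__(self):
        self.mode, self.buf, self.nack_buf, self.delivered = "idle", [], [], []
        self.jd = self.id = self.last = None
        self.fresh_id = count(100)       # stands in for choosing id not in issued_r

    def output(self):
        if self.nack_buf:
            return ("ack", self.nack_buf.pop(0), False)
        if self.mode == "accept":
            return ("accept", self.jd, self.id)
        if self.mode == "rcvd":          # receive_msg(m): deliver to the user
            self.delivered.append(self.buf.pop(0))
            if not self.buf:
                self.mode = "ack"
            return None
        if self.mode == "ack":
            return ("ack", self.last, True)
        return None

    def receive(self, pkt):
        if pkt[0] == "needid":
            if self.mode == "idle":
                self.mode, self.jd, self.id = "accept", pkt[1], next(self.fresh_id)
        elif pkt[0] == "send":
            _, m, id_ = pkt
            if self.mode == "accept" and self.id == id_:
                self.mode, self.last = "rcvd", id_
                self.buf.append(m)
            elif self.last != id_:
                self.nack_buf.append(id_)
        elif pkt[0] == "done":
            if (self.mode, self.id) == ("accept", pkt[1]) or (self.mode, self.last) == ("ack", pkt[1]):
                self.mode = "idle"
                self.jd = self.id = self.last = None


def run(messages, rounds=20):
    s, r, log = Sender(), Receiver(), []
    s.buf = list(messages)
    for _ in range(rounds):
        if s.mode == "idle" and s.buf:
            s.choose_jd()
        for src, dst in ((s, r), (r, s)):
            pkt = src.output()
            if pkt:
                log.append(pkt[0])       # record the packet type only
                dst.receive(pkt)
    return r.delivered, log


delivered, log = run(["a"])
```

For a single message the log shows the five packet types in the order needid, accept, send, ack,
done, with the send packet retransmitted once before the acknowledgement arrives.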


The above discussion has concentrated on normal mode of operation, where the sender and
receiver are synchronized. However, because both the sender and the receiver have modes where
they repeatedly send certain packets and await acknowledgements, they would be very vulnerable
to crashes of the other node if we did not have some means of informing the node about crashes.
The “bad” modes are accept for the receiver and send for the sender.

First consider a situation where the receiver is in accept mode but where the sender due to
crashes is not in the expected needid mode with jd_s = jd_r. The sender could be in idle mode
or even in needid mode with a new jd identifier such that jd_s ≠ jd_r. Now, every time the sender
receives a bad accept packet, it places the associated identifier id in done-buf_s, which leads to a
send_pkt_sr(done, id) step, which may or may not succeed in putting the packet into the channel.
If it succeeds, the packet will eventually be received and the receiver will be dislodged (cf. the
definition of the receive_pkt_sr(done, id) steps of the receiver). If it does not succeed, the sender
will eventually receive another accept packet, which gives rise to another send_pkt_sr(done, id)
step. This cannot go on forever because of channel liveness, so eventually the receiver will
receive (done, id). Thus, the done packets are used to inform the receiver to leave a bad accept
mode in the same way done packets were used during normal mode of operation to inform the
receiver that the sender has received the positive acknowledgement. An additional problem
arises because the receiver immediately could receive an old needid packet and thus reenter a
bad accept mode. However, there can only be finitely many such old needid packets in the
channel, so this cannot go on forever. Below we shall see how this is proved formally.

Another “bad” situation occurs when the sender is in send mode but where the receiver
due to crashes is not in the expected accept mode with id_r = id_s. The receiver could be in
idle mode or it could have received an old needid packet and thus be in accept mode with
id_r ≠ id_s. Now, every time the receiver receives a (send, m, id) packet it will, since id ≠ id_r,
add id to nack-buf_r, which leads to send_pkt_rs(ack, id, false). This continues, as for the done
packets above, until (ack, id, false) is received by the sender, and at that point the sender resets
to idle mode.
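
The two recovery paths just described can be replayed on the relevant step rules. The sketch
below encodes four of the input steps as functions over plain dictionaries; the function names and
the dictionary representation are assumptions made for illustration, and the channels are left out
(packets are handed over directly).

```python
def receiver_on_send(r, m, id_):
    """receive_pkt_sr(send, m, id)"""
    if r["mode"] == "rec":
        return
    if r["mode"] == "accept" and r["id"] == id_:
        r["mode"], r["last"] = "rcvd", id_
        r["buf"].append(m)
    elif r["last"] != id_:
        r["nack_buf"].append(id_)


def sender_on_ack(s, id_, b):
    """receive_pkt_rs(ack, id, b)"""
    if s["mode"] == "rec":
        return
    if s["mode"] == "send" and s["id"] == id_:
        s.update(mode="idle", ack=b, jd=None, id=None, msg=None)
    if b:
        s["done_buf"].append(id_)


def sender_on_accept(s, jd, id_):
    """receive_pkt_rs(accept, jd, id)"""
    if s["mode"] == "rec":
        return
    if s["mode"] == "needid" and s["jd"] == jd:
        s["mode"], s["id"] = "send", id_
    elif s["id"] != id_:
        s["done_buf"].append(id_)


def receiver_on_done(r, id_):
    """receive_pkt_sr(done, id)"""
    if (r["mode"] == "accept" and r["id"] == id_) or (r["mode"] == "ack" and r["last"] == id_):
        r.update(mode="idle", jd=None, id=None, last=None)


def fresh_receiver():
    return dict(mode="idle", jd=None, id=None, last=None, buf=[], nack_buf=[])


def fresh_sender():
    return dict(mode="idle", jd=None, id=None, msg=None, ack=False, done_buf=[])


# Scenario 1: sender stuck in send mode, receiver has crashed and recovered.
s1 = dict(fresh_sender(), mode="send", jd=3, id=7, msg="m")
r1 = fresh_receiver()
receiver_on_send(r1, "m", 7)                 # id 7 is unknown: queue a negative ack
sender_on_ack(s1, r1["nack_buf"].pop(0), False)

# Scenario 2: receiver stuck in accept mode, sender has crashed and recovered.
s2 = fresh_sender()
r2 = dict(fresh_receiver(), mode="accept", jd=3, id=9)
sender_on_accept(s2, 3, 9)                   # bad accept packet: queue a done packet
receiver_on_done(r2, s2["done_buf"].pop(0))
```

In both scenarios the stuck node ends up in idle mode, and in the first the message is not
delivered and the sender records a negative acknowledgement.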


The actions grow-jd-used_s(jds) and grow-issued_r(ids) allow identifiers to be added to the sets of
used identifiers of the sender and receiver, respectively, as long as there are still “enough” (i.e.,
infinitely many) unused identifiers left. These actions are not required for the correctness of H
but allow a final implementation on a physical machine to throw away some identifiers. This is
typically required by algorithms for generating unused identifiers.


It may seem strange that the sender and receiver need to engage in the initial needid/accept 
handshake. Why don’t they just agree on using, say, the natural numbers in increasing order 
as identifiers? Then the receiver will only accept a message if the associated identifier is greater 
than the identifier of the last message accepted. The answer is that H is designed so that the 
receiver can use the same set of identifiers for several senders. Thus, as defined, the receiver does
not have to remember (in stable storage!) the last identifier used by each individual sender. We
do not in this report show how the receiver should work for several senders.


The discussion above has partly been based on liveness assumptions on the sender and receiver. 
We now consider how to specify this liveness formally. 

9.2.4 Liveness 

Sender 


We define the following two sets of the locally-controlled actions of the sender: 


C_{H,s1} = {ack(true), ack(false), recover_s} ∪
           {send_pkt_sr(needid, jd) | jd ∈ JD} ∪
           {send_pkt_sr(send, m, id) | m ∈ Msg ∧ id ∈ ID}
C_{H,s2} = {send_pkt_sr(done, id) | id ∈ ID}


The liveness formula Q_{H,s} that induces the liveness condition L_{H,s} for A_{H,s} is now defined as

Q_{H,s} = WF(C_{H,s1}) ∧ WF(C_{H,s2})


Note that the reason we need weak fairness to C_{H,s2} separately is that sending of done packets
can occur at any time. If we only had weak fairness to C_{H,s1} ∪ C_{H,s2}, there would be no
requirement to issue done packets if the sender is in send mode and keeps sending send packets.
This would not lead to correct operation of H.

Thus, H_s can intuitively be seen as consisting of two parallel processes: one dealing with
the actions in C_{H,s1} and one dealing with issuing done packets. Since the liveness requirements
are weak fairness, the liveness of H_s can be implemented on a physical machine by a scheduler
giving fair turns to the two parallel processes.
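
The need for two fairness classes can be seen in a toy scheduler. The sketch below is only an
illustration under simplifying assumptions: run_scheduler is a hypothetical round-robin scheduler,
the sender state is reduced to its mode and done-buf_s, and each class is tried in a fixed order.
With one merged class, a schedule that always picks the send action still gives the class a turn at
every step, yet never issues a done packet.

```python
def run_scheduler(classes, state, steps):
    """Round-robin over action classes; each turn fires the first enabled action of the class."""
    trace = []
    for k in range(steps):
        for name, enabled, fire in classes[k % len(classes)]:
            if enabled(state):
                fire(state)
                trace.append(name)
                break
    return trace


# Two locally controlled actions of a sender that stays in send mode.
send = ("send", lambda st: st["mode"] == "send", lambda st: None)
done = ("done", lambda st: bool(st["done_buf"]), lambda st: st["done_buf"].pop(0))

# Separate classes, as in WF(C_{H,s1}) ∧ WF(C_{H,s2}): done packets get their turn.
two_classes = run_scheduler([[send], [done]], {"mode": "send", "done_buf": [41, 42]}, 6)

# One merged class, as in WF(C_{H,s1} ∪ C_{H,s2}): this schedule starves the done packets.
one_class = run_scheduler([[send, done]], {"mode": "send", "done_buf": [41, 42]}, 6)
```

Here two_classes interleaves send and done steps until done-buf_s is empty, while one_class
consists of send steps only.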


By Lemma 4.7, Q_{H,s} is an environment-free liveness formula for A_{H,s}. Thus, H_s is a live I/O
automaton. Furthermore, by Lemma 4.8, Q_{H,s} is stuttering-insensitive.


Receiver 


We define the following two sets of locally-controlled actions of the receiver: 


C_{H,r1} = {recover_r} ∪
           {receive_msg(m) | m ∈ Msg} ∪
           {send_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID} ∪
           {send_pkt_rs(ack, id, true) | id ∈ ID}
C_{H,r2} = {send_pkt_rs(ack, id, false) | id ∈ ID}


The liveness formula that induces the liveness condition for the receiver of H can now be
expressed as

Q_{H,r} = WF(C_{H,r1}) ∧ WF(C_{H,r2})


The reason why we need weak fairness to two sets of actions is similar to the reason given above
for the sender.


By Lemma 4.7, Q_{H,r} is an environment-free liveness formula for A_{H,r}. Thus, H_r is a live I/O
automaton. Furthermore, by Lemma 4.8, Q_{H,r} is stuttering-insensitive.


9.3. The Specification of H 


As depicted in Figure 9.1, H consists of the sender and receiver processes and the two channels. 
So, first define H'' = (A_{H''}, L_{H''}) to be the following live I/O automaton.

H'' = H_s ‖ H_r ‖ Ch_sr ‖ Ch_rs


Since Q_{H,s}, Q_{H,r}, Q_{Ch,sr}, and Q_{Ch,rs} are all stuttering-insensitive, Proposition 4.4
implies that L_{H''} is induced by

Q_H = Q_{H,s} ∧ Q_{H,r} ∧ Q_{Ch,sr} ∧ Q_{Ch,rs}

By Definition 2.2 the channel actions send_pkt_sr(...), receive_pkt_sr(...), send_pkt_rs(...), and
receive_pkt_rs(...) are all output actions of H''. We need to hide these in order to get a live I/O
automaton with the same external actions as S.

However, recall from Lemma 5.10 that the existence of an index mapping between executions
at two levels of abstraction allows one to conclude certain properties of the (common)
external actions of the executions. Thus, the more external actions the two levels have in common,
the stronger the correspondence between the executions.

At the G level we defined G' to be the system where channel communication is external, i.e.,
G' was simply the parallel composition of the sender/receiver process and the channels, similar
to H'' above. Now, the actions send_pkt_sr(m, id), receive_pkt_sr(m, id), send_pkt_rs(id, b), and
receive_pkt_rs(id, b) of G' correspond to the send_pkt_sr(send, m, id), receive_pkt_sr(send, m, id),
send_pkt_rs(ack, id, b), and receive_pkt_rs(ack, id, b) actions at the H level. The channel
actions at the H level that deal with needid, accept, and done packets, on the other hand, do not
correspond to any external actions of G'. Thus, we first hide these actions from H'' to get H'. Let


𝒜_{H1} = {send_pkt_sr(needid, jd) | jd ∈ JD} ∪
         {receive_pkt_sr(needid, jd) | jd ∈ JD} ∪
         {send_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID} ∪
         {receive_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID} ∪
         {send_pkt_sr(done, id) | id ∈ ID} ∪
         {receive_pkt_sr(done, id) | id ∈ ID}


Then H' = (A_{H'}, L_{H'}) is defined as

H' = H'' \ 𝒜_{H1}

By Proposition 4.5, L_{H'} is induced by Q_H.
Finally, to get the H protocol, we hide the remaining channel actions. Let


𝒜_{H2} = {send_pkt_sr(send, m, id) | m ∈ Msg ∧ id ∈ ID} ∪
         {receive_pkt_sr(send, m, id) | m ∈ Msg ∧ id ∈ ID} ∪
         {send_pkt_rs(ack, id, b) | id ∈ ID ∧ b ∈ Bool} ∪
         {receive_pkt_rs(ack, id, b) | id ∈ ID ∧ b ∈ Bool}


Thus, H = (A_H, L_H) is defined as

H = H' \ 𝒜_{H2}

Again, by Proposition 4.5, L_H is induced by Q_H.
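
The effect of the two hiding steps on traces can be illustrated as follows. The sketch is a
simplification made for illustration: actions are plain strings, the send_pkt and receive_pkt
actions for a packet type are merged into one label such as "pkt:needid", and the names USER,
CONTROL, DATA, and trace are assumptions, not notation from this report.

```python
# External action labels before any hiding (H'').
USER = {"send_msg", "receive_msg", "ack", "crash_s", "recover_s", "crash_r", "recover_r"}
CONTROL = {"pkt:needid", "pkt:accept", "pkt:done"}   # hidden in the first step
DATA = {"pkt:send", "pkt:ack"}                       # hidden in the second step

ext_H2 = USER | CONTROL | DATA      # external actions of H''
ext_H1 = ext_H2 - CONTROL           # external actions of H'
ext_H = ext_H1 - DATA               # external actions of H


def trace(execution, external):
    """The trace of an execution: its subsequence of external actions."""
    return [a for a in execution if a in external]


execution = ["send_msg", "pkt:needid", "pkt:accept", "pkt:send",
             "receive_msg", "pkt:ack", "ack", "pkt:done"]

trace_H1 = trace(execution, ext_H1)  # comparable with traces of G'
trace_H = trace(execution, ext_H)    # only user-visible actions remain
```

Hiding does not change the execution itself, only which of its actions are visible in the trace.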


Now, in the proof below we prove that H’ correctly implements G’ (or actually a slightly different 
version of G’ in which the channel actions are renamed to completely match the (remaining) 
external channel actions of H’). Then the substitutivity results of Proposition 2.16 are used to 
infer that H correctly implements G. 


9.4 Correctness of H 


The correctness of H with respect to G is now considered. We first add history variables to H'
to get H'^h = (A_{H'}^h, L_{H'}^h) as described in Section 5.1.5. Then we state some invariants of
A_{H'}^h and show the existence of a refinement mapping from A_{H'}^h to A_{G'}^ρ, where A_{G'}^ρ
is a slightly modified version of A_{G'} obtained by renaming some channel actions. This refinement
mapping is then used to show that H'^h correctly implements G'^ρ, which, in turn, allows us to
conclude that H correctly implements G.


9.4.1 Adding History Variables to H’ 


We add three history variables to H' and denote the resulting live I/O automaton by
H'^h = (A_{H'}^h, L_{H'}^h).


H  Variable    Type        Initially  Description

H  used_s      […]         […]        A history variable giving the list of ids ever used by
                                      the sender (and thus accepted in accept packets from the
                                      receiver). Same as at the G level.
H  seen_r      P(JD × ID)  […]        A history variable consisting of all the (jd, id) pairs
                                      the receiver has ever seen.
H  current-ok  Bool        false      A history variable describing the state of the current
                                      message. Same as at the G level.

H = History


By the results in Section 5.1.5, we are allowed to change the history variables anywhere in the
effect clauses of the step rules defining the steps of A_{H'}^h. The effect clauses of step rules of
A_{H'} are, in turn, defined by the corresponding effect clauses of the components of H' as
described in Section 4.1.1.1. We show where the changes to the history variables should be placed
in the effect clauses. We omit the assignments to the original variables (by writing ... instead)
but outline the if-then-else statements.


choose_jd(jd)
  Precondition:
    (* Precondition from H_s *)
  Effect:
    (* Effect clause from H_s *)
    if mode_s ≠ rec then
      current-ok := true

receive_pkt_sr(needid, jd)
  Precondition:
    (* Precondition from Ch_sr *)
  Effect:
    (* Effect clause from Ch_sr *)
    (* Effect clause from H_r *)
    if mode_r = idle then
      ...
      seen_r := seen_r ∪ {(jd_r, id_r)}

receive_pkt_rs(accept, jd, id)
  Precondition:
    (* Precondition from Ch_rs *)
  Effect:
    (* Effect clause from Ch_rs *)
    (* Effect clause from H_s *)
    if mode_s ≠ rec then
      if mode_s = needid ∧ jd_s = jd then
        ...
        used_s := used_s ^ id
      else if id_s ≠ id then
        ...

receive_pkt_sr(send, m, id)
  Precondition:
    (* Precondition from Ch_sr *)
  Effect:
    (* Effect clause from Ch_sr *)
    (* Effect clause from H_r *)
    if mode_r ≠ rec then
      if mode_r = accept ∧ id_r = id then
        ...
        if id = id_s then
          current-ok := false
      else if last_r ≠ id then
        ...

crash_s
  Effect:
    (* Effect clause from H_s *)
    current-ok := false

crash_r
  Effect:
    (* Effect clause from H_r *)
    current-ok := false


From Lemma 5.16 we know that L_{H'}^h is induced by Q_H.


9.4.2 Invariants 


To help us in the refinement mapping proof below, we state some invariants of A_{H'}^h without
proofs. The proofs could be performed similarly to the proofs of the G level invariants in
Appendix C.


The first invariant states properties of issued_r.


Invariant 9.1

1. If id_r ≠ nil then id_r ∈ issued_r
2. If last_r ≠ nil then last_r ∈ issued_r
3. If (accept, jd, id) ∈ rs then id ∈ issued_r
4. used_s ⊆ issued_r


Define in any state of A_{H'}^h jds(sr) to be the set of jd components of the packets in the sr
channel. Formally, since only needid packets have jd components in the sr channel, we have

jds(sr) = {jd | (needid, jd) ∈ sr}

Similarly,

jds(rs) = {jd | (accept, jd, id) ∈ rs}

The following invariant then states that all jds in the system are used by the sender.


Invariant 9.2

1. jd_s ∈ jd-used_s if jd_s ≠ nil
2. jds(sr) ⊆ jd-used_s
3. jd_r ∈ jd-used_s if jd_r ≠ nil
4. jds(rs) ⊆ jd-used_s

The following invariants state simple properties. 


Invariant 9.3

1. If mode_r ∈ {idle, accept} then last_r = nil


Invariant 9.4

1. If mode_r = accept then id_r ≠ nil


Invariant 9.5

1. If mode_s = rec ∨ mode_r = rec then current-ok = false


Invariant 9.6

1. If id_s ≠ nil then mode_s ∈ {send, rec}


The next invariant states that the identifiers in the system are in most cases registered in the
history variable used_s.


Invariant 9.7

1. If id_s ≠ nil then id_s ∈ used_s
2. If (send, m, id) ∈ sr then id ∈ used_s
3. If mode_r = rcvd then last_r ∈ used_s
4. If mode_r = ack then last_r ∈ used_s
5. If (ack, id, b) ∈ rs then id ∈ used_s

The identifiers for which the sender will issue or has issued done packets can never be equal to 
the current identifier of the sender. 


Invariant 9.8

1. If id ∈ done-buf_s then id ≠ id_s
2. If (done, id) ∈ sr then id ≠ id_s


The history variable seen_r records all the (jd, id) pairs the receiver has ever seen. Thus, when
the receiver associates an identifier id to a received jd, the pair (jd, id) is added to seen_r. Due
to crashes the receiver might associate two different id identifiers to the same jd identifier.
However, it can never happen that the receiver associates the same id to different jds.


Invariant 9.9

1. If id_r ≠ nil then (jd_r, id_r) ∈ seen_r
2. If (jd, id) ∈ seen_r ∧ (jd', id) ∈ seen_r then jd = jd'
3. If (accept, jd, id) ∈ rs then (jd, id) ∈ seen_r
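
As an illustration, the three parts of Invariant 9.9 can be written as an executable predicate
over a snapshot of the relevant state components. The function name check_invariant_9_9 and the
representation (a Python set of pairs for seen_r, a list of tuples for the rs channel, None for
nil) are assumptions made for this sketch.

```python
def check_invariant_9_9(jd_r, id_r, seen_r, rs):
    """Return True if the snapshot satisfies all three parts of Invariant 9.9."""
    # Part 1: the receiver's current pair has been recorded.
    part1 = id_r is None or (jd_r, id_r) in seen_r

    # Part 2: no id is associated with two different jds.
    jd_of = {}
    part2 = True
    for jd, id_ in seen_r:
        if jd_of.setdefault(id_, jd) != jd:
            part2 = False

    # Part 3: every accept packet in the rs channel carries a recorded pair.
    part3 = all((p[1], p[2]) in seen_r for p in rs if p[0] == "accept")
    return part1 and part2 and part3


# One jd associated with two ids (possible after a receiver crash): allowed.
ok = check_invariant_9_9(0, 101, {(0, 100), (0, 101)}, [("accept", 0, 100)])

# One id associated with two jds: violates Part 2.
bad = check_invariant_9_9(1, 100, {(0, 100), (1, 100)}, [])
```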


Invariant 9.10

1. If mode_s = needid ∧ mode_r = accept ∧ jd_s = jd_r then
   (send, _, id_r) ∉ sr ∧ (done, id_r) ∉ sr


The final invariant corresponds to Invariant 8.12 at the G level. It states that there are always
enough unused ids and jds left.


Invariant 9.11

1. |ID \ issued_r| = ∞
2. |JD \ jd-used_s| = ∞


Below we refer to the conjunction of the invariants by I_H.


9.4.3 Safety


The safe I/O automata A_{H'}^h and A_{G'} do not agree on their input and output actions. The
difference is however very small: A_{H'}^h adds packets to the channel in send_pkt_sr(send, m, id)
steps, whereas the corresponding steps in A_{G'} are send_pkt_sr(m, id). There is a similar
difference with respect to send_pkt_rs(ack, id, b) steps and the corresponding receive_pkt_sr and
receive_pkt_rs steps. So, define the following action mapping:


ρ = [send_pkt_sr(m, id) ↦ send_pkt_sr(send, m, id) | m ∈ Msg ∧ id ∈ ID] ∪
    [receive_pkt_sr(m, id) ↦ receive_pkt_sr(send, m, id) | m ∈ Msg ∧ id ∈ ID] ∪
    [send_pkt_rs(id, b) ↦ send_pkt_rs(ack, id, b) | id ∈ ID ∧ b ∈ Bool] ∪
    [receive_pkt_rs(id, b) ↦ receive_pkt_rs(ack, id, b) | id ∈ ID ∧ b ∈ Bool] ∪
    [a ↦ a | a ∈ acts(A_{G'}) \ 𝒜_G]


where 𝒜_G is defined in Section 8.4; the last component covers all the actions that are not
renamed by ρ. Clearly ρ is applicable to G', so define G'^ρ = (A_{G'}^ρ, L_{G'}^ρ) as follows.

G'^ρ = ρ(G')

By Proposition 4.6, L_{G'}^ρ is induced by ρ(Q_G).


We now define a function from states(A_{H'}^h) to states(A_{G'}^ρ). Below, in Lemma 9.13, this
function is proved to be a refinement mapping from A_{H'}^h to A_{G'}^ρ with respect to I_H and
I_G. (Note that the invariant I_G of A_{G'} is also an invariant of A_{G'}^ρ.)


Definition 9.12 (Refinement Mapping from A_{H'}^h to A_{G'}^ρ)
If s ∈ states(A_{H'}^h) then define R_HG(s) to be the state u ∈ states(A_{G'}^ρ) such that

1. u.mode_s        = s.mode_s
   u.buf_s         = s.buf_s
   u.used_s        = s.used_s
   u.current-msg_s = s.current-msg_s
   u.current-ack_s = s.current-ack_s
   u.last_r        = s.last_r
   u.buf_r         = s.buf_r
   u.issued_r      = s.issued_r
   u.nack-buf_r    = s.nack-buf_r
   u.current-ok    = s.current-ok

2. u.last_s = s.id_s

3. u.good_s = (if s.mode_s = needid then
                 {id | (accept, s.jd_s, id) ∈ s.rs} ∪
                 (if s.mode_r = accept ∧ s.jd_r = s.jd_s then {s.id_r} else ∅)
               else ∅)

4. u.mode_r = (if s.mode_r = accept then idle else s.mode_r)

5. u.good_r = (if s.mode_r = accept then {s.id_r} else ∅)

6. The packets in each channel in u are exactly the send and ack packets in the same channel
   in s.
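
The six parts of the definition can be transcribed almost literally into an executable sketch.
The dictionary representation of states, the underscore spellings of the variable names, and the
function name refinement_map are assumptions made for illustration; the sketch also keeps the
send and ack tags on the packets it retains, which is a simplification of Part 6.

```python
SHARED = ("mode_s", "buf_s", "used_s", "current_msg_s", "current_ack_s",
          "last_r", "buf_r", "issued_r", "nack_buf_r", "current_ok")


def refinement_map(s):
    """Map an H-level state s (a dict) to the corresponding G-level state u."""
    u = {name: s[name] for name in SHARED}                      # Part 1
    u["last_s"] = s["id_s"]                                     # Part 2

    good_s = set()                                              # Part 3
    if s["mode_s"] == "needid":
        good_s = {p[2] for p in s["rs"] if p[0] == "accept" and p[1] == s["jd_s"]}
        if s["mode_r"] == "accept" and s["jd_r"] == s["jd_s"]:
            good_s.add(s["id_r"])
    u["good_s"] = good_s

    in_accept = s["mode_r"] == "accept"
    u["mode_r"] = "idle" if in_accept else s["mode_r"]          # Part 4
    u["good_r"] = {s["id_r"]} if in_accept else set()           # Part 5

    u["sr"] = [p for p in s["sr"] if p[0] == "send"]            # Part 6
    u["rs"] = [p for p in s["rs"] if p[0] == "ack"]
    return u


# Example: the sender waits in needid mode while the receiver is in accept mode.
s = dict(mode_s="needid", buf_s=[], used_s=[], current_msg_s="m", current_ack_s=False,
         jd_s=0, id_s=None, mode_r="accept", jd_r=0, id_r=100, last_r=None,
         buf_r=[], issued_r={99, 100}, nack_buf_r=[], current_ok=True,
         sr=[("needid", 0)], rs=[("accept", 0, 100), ("accept", 5, 99)])
u = refinement_map(s)
```

In the example the H-level accept mode is mapped to idle at the G level, the identifier 100 appears
in both good sets, and the needid and accept packets vanish from the channels.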
Lemma 9.13


A_{H'}^h ≤_R A_{G'}^ρ via R_HG.


Proof 


We prove that R_HG is a refinement mapping from A_{H'}^h to A_{G'}^ρ with respect to I_H and I_G.
We check the two conditions (which we call base case and inductive case, respectively) of
Definition 5.2.


Base Case

It is easy to see that for the start state s of A_{H'}^h, R_HG(s) is a start state of A_{G'}^ρ.

Inductive Case


Assume (s, a, s') ∈ steps(A_{H'}^h) such that s and s' satisfy I_H and R_HG(s) satisfies I_G. Below
we consider cases based on a (and sometimes subcases of each case), and for each (sub)case we
define a finite execution fragment α of A_{G'}^ρ of the form (R_HG(s), a', u'', a'', u''', ...,
R_HG(s')) with trace(α) = trace(a). For brevity we let u denote R_HG(s) and u' denote R_HG(s').


Unless otherwise stated we let Parts 1-6 refer to the six parts of Definition 9.12.


a ∈ {send_msg(m), receive_msg(m), ack(b), recover_s}


Then it is easy to see that (u, a, u') ∈ steps(A_{G'}^ρ).


a = crash_s


We show that (u, crash_s, u'', shrink_good_s(I), u'), where u'' and I are defined below, is a finite
execution fragment of A_{G'}^ρ by showing that (u, crash_s, u'') and (u'', shrink_good_s(I), u')
are steps of A_{G'}^ρ. Clearly the execution fragment has the right trace.


Define u'' to be the same as u' except that u''.good_s = u.good_s. Then it is easy to see that
(u, crash_s, u'') ∈ steps(A_{G'}^ρ).


Now, if s.mode_s = needid then u.good_s might be nonempty whereas u'.good_s = ∅ according
to R_HG. So, define I = u''.good_s. (Note, I = ∅ if s.mode_s ≠ needid.) Then, obviously,
(u'', shrink_good_s(I), u') ∈ steps(A_{G'}^ρ).


a = crash_r


We show that (u, crash_r, u'', shrink_good_r(I), u'), where I = u.good_r and u'' is defined below,
is a finite execution fragment of A_{G'}^ρ by showing that (u, crash_r, u'') and
(u'', shrink_good_r(I), u') are steps of A_{G'}^ρ. Clearly the execution fragment has the right
trace.


Define u'' to be the same as u', except that u''.good_r = u.good_r.


It is easy to see that (u, crash_r, u'') ∈ steps(A_{G'}^ρ). The only interesting case is to show
that good_r is handled correctly, but from the definition of u'' we have u''.good_r = u.good_r,
which is as required.

Since u'.mode_r = rec, we get from Invariant 9.5 that u'.current-ok and then also u''.current-ok
are false, so shrink_good_r(I) is enabled in u''. The only difference between u'' and u' is the
value of good_r. We have u''.good_r = I and u'.good_r = ∅ since s'.mode_r = rec ≠ accept. This
change in good_r is as required by the definition of shrink_good_r(I) in A_{G'}^ρ.


a = recover_r


We show that (u, recover_r, u') ∈ steps(A_{G'}^ρ). This step (and finite execution fragment)
clearly has the right trace.


First note that recover_r is enabled in u. We then carry out a case-by-case check to see that all
state variables change appropriately. The only interesting cases are good_r and issued_r.


Both u.good_r = ∅ and u'.good_r = ∅ by the definition of R_HG since mode_r ≠ accept in s and s'.
Thus, the value of good_r is unchanged as required by the definition of recover_r in A_{G'}^ρ.


From the definition of recover_r in A_{H'}^h and R_HG we have that u.issued_r = u'.issued_r. To
show that it is allowed by recover_r in A_{G'}^ρ to leave issued_r unchanged, we must show that
u.used_s ⊆ u.issued_r and u.good_s ⊆ u.issued_r. But both of these requirements follow directly
from the definition of R_HG and Invariant 9.1.


a = choose_jd(jd)


We show that (u, prepare, u') ∈ steps(A_{G'}^ρ). This step (and finite execution fragment) clearly
has the right trace.


Since choose_jd(jd) is enabled in s and u = R_HG(s), it is immediate that prepare is enabled in u.
A case analysis on the variables of A_{G'}^ρ shows that all are modified properly; the only
interesting case is that of good_s. There, the definition of prepare in A_{G'}^ρ requires that
u'.good_s = ∅. We must show that this is the case:


First, assume s.jd_r = nil. By the definition of choose_jd(jd) in A_{H'}^h we have s'.jd_s ≠ nil,
so since s'.jd_r = s.jd_r, we have s'.jd_s ≠ s'.jd_r.

Now assume s.jd_r ≠ nil. Then Invariant 9.2 gives us that s.jd_r ∈ s.jd-used_s, and since
s'.jd_r = s.jd_r we have s'.jd_r ∈ s.jd-used_s. By the definition of choose_jd(jd) in A_{H'}^h we
have s'.jd_s ∉ s.jd-used_s, so also in this case we get s'.jd_s ≠ s'.jd_r.

From Invariant 9.2 we get jds(s.rs) ⊆ s.jd-used_s. By the definition of choose_jd(jd) in A_{H'}^h
we have jds(s'.rs) = jds(s.rs) and s'.jd_s ∉ s.jd-used_s, so we get s'.jd_s ∉ jds(s'.rs).

Finally, since s'.jd_s ≠ s'.jd_r and s'.jd_s ∉ jds(s'.rs), we get from the definition of R_HG that
u'.good_s = ∅ as required.


a = send_pkt_sr(needid, jd)


We show that u' = u. Then the execution fragment u of A_{G'}^ρ clearly has the right properties.


The only difference between s and s' is that s' contains an additional needid packet in the
sr channel. But this does not affect the values of any of the variables of A_{G'}^ρ according to
the definition of R_HG.


a = receive_pkt_sr(needid, jd)

We consider two cases.

1. s.mode_r ≠ idle.

Then the only difference between s and s' is that the latter is missing one needid packet from the sr channel. But this does not affect the values of any variables of A_G^{p'}, so that u' = u. Then the execution fragment u of A_G^{p'} clearly has the right properties.

2. s.mode_r = idle.

There are two subcases.

(a) s.mode_s ≠ needid or jd ≠ s.jd_s.
We show that (u, grow_good_r({id}), u') ∈ steps(A_G^{p'}), where id is the identifier chosen in the step of A_H^{h'}, i.e., id = s'.id_r. Clearly the step has the right trace (the empty trace).
The definition of the step in A_H^{h'} implies that id ∉ s.issued_r. From the definition of R_HG we have u.issued_r = s.issued_r, so that grow_good_r({id}) is enabled in u.
We consider the state changes. From the definition of R_HG we have u.good_r = ∅ and u'.good_r = {id}. This is the change to good_r specified by the definition of grow_good_r({id}). Also, the step of A_H^{h'} explicitly adds id to issued_r, which is as required by the definition of grow_good_r({id}) in A_G^{p'}.
We claim that all variables of A_G^{p'} other than good_r and issued_r have the same values in u and u'. This is immediate for mode_s, buf_s, used_s, current-msg_s, current-ack_s, buf_r, last_r, nack-buf_r, current-ok and last_s. For mode_r, we have a change at the H level, from idle to accept. But both of these correspond to idle at the G level.
We now show that u.good_s = u'.good_s. We make a case analysis based on the definition of this case. First assume s.mode_s ≠ needid. Then also s'.mode_s ≠ needid, so from the definition of R_HG we have u.good_s = u'.good_s = ∅ as needed.
Now, assume s.mode_s = needid and jd ≠ s.jd_s. Since s'.jd_r = jd and s'.jd_s = s.jd_s we get s'.jd_s ≠ s'.jd_r, so even though mode_r changes to accept in A_H^{h'}, it is easy to see from the definition of R_HG that u.good_s = u'.good_s.
Finally, the only difference between the channels in s and s' is that the sr channel in s' is missing one needid packet. But then the values of the channels in u and u' are the same.

(b) s.mode_s = needid and jd = s.jd_s.
We show that (u, grow_good_r({id}), u'', grow_good_s({id}), u'), where u'' is defined below and id = s'.id_r, is a finite execution fragment of A_G^{p'}. We do this by showing that (u, grow_good_r({id}), u'') and (u'', grow_good_s({id}), u') are steps of A_G^{p'}. The execution fragment clearly has the right trace.
Define u'' to be the same as u', except that u''.good_s = u'.good_s \ {id}.
The argument that (u, grow_good_r({id}), u'') is a step of A_G^{p'} is the same as the argument for the previous case, except for the part about good_s. Here, u.good_s = u''.good_s by explicit construction.
To show that (u'', grow_good_s({id}), u') is a step of A_G^{p'}, it suffices to note that id ∈ u''.issued_r, id ∈ u''.good_r, and id ∉ u.used_s. (This latter claim uses Invariant 9.1.)


a = send_pkt_rs(accept, jd, id)

We show that u' = u. Then the execution fragment u of A_G^{p'} clearly has the right properties.

The only difference between s and s' is that s' contains an additional accept message in the rs channel. We claim that this does not affect the values of any of the A_G^{p'} variables.

The only interesting case to check is the value of good_s. The only way the step can modify this variable according to R_HG is to add an id to good_s, by putting an (accept, s'.jd_s, id) message into the rs channel. By definition of the step in H, it must be that s'.jd_s = s.jd_r and id = s.id_r. Since s.jd_s = s'.jd_s, it follows that s.jd_s = s.jd_r. But then id ∈ u.good_s. This contradicts the assumption that the step modified this variable.


a = receive_pkt_rs(accept, jd, id)

There are two cases.

1. s.mode_s = rec

In this case the only difference between s' and s is that s has an extra (accept, jd, id) packet on rs, but from the definition of R_HG we see that this does not affect any of the variables in A_G^{p'} since s.mode_s ≠ needid. Thus u' = u. Then the execution fragment u of A_G^{p'} has the right properties.

2. s.mode_s ≠ rec

We consider cases.

(a) s.mode_s ≠ needid or jd ≠ s.jd_s.
We show that u' = u. Then the execution fragment u of A_G^{p'} has the right properties.
The only difference between s and s' is that s' removes a single accept message from the rs channel and that done-buf_s might be updated. We claim that this does not affect the values of any of the A_G^{p'} variables; the only interesting case to check is that of good_s, and there, the fact that s.mode_s ≠ needid or jd ≠ s.jd_s implies that good_s has the same value in u and u'.

(b) s.mode_s = needid and jd = s.jd_s.
We show that (u, choose_id(id), u'', shrink_good_s(I), u'), where I = u.good_s and u'' is defined below, is an execution fragment of A_G^{p'} by showing that (u, choose_id(id), u'') and (u'', shrink_good_s(I), u') are steps of A_G^{p'}. Clearly the execution fragment has the right trace.
Define u'' to be the same as u' except that u''.good_s = I.
First consider (u, choose_id(id), u''). Since s.mode_s = needid, we have u.mode_s = needid. Then, to prove that choose_id(id) is enabled in u, we need to show that id ∈ u.good_s. In s, we have (accept, jd, id) in the rs channel, and moreover jd = s.jd_s. Thus, from the definition of R_HG we have id ∈ u.good_s as needed.
Now we consider the effects on the variables in A_G^{p'}. A case analysis shows that the changes reflected in u'' are as specified by the step of A_G^{p'}. The only interesting case is that of good_s, where the definition u''.good_s = I = u.good_s ensures that the value is unchanged, as required by the definition of choose_id(id) in A_G^{p'}.
To see that (u'', shrink_good_s(I), u') is a step of A_G^{p'}, note that u'.good_s = ∅. Therefore, the changes are as required by the definition of shrink_good_s(I) in A_G^{p'}.


a = send_pkt_sr(send, m, id)

Then it is easy to see that (u, send_pkt_sr(m, id), u') ∈ steps(A_G^{p'}). This step (and execution fragment) clearly has the right trace.

a = receive_pkt_sr(send, m, id)

We show that (u, receive_pkt_sr(m, id), u') ∈ steps(A_G^{p'}). This step (and execution fragment) has the right trace.

We consider four (exclusive and exhaustive) cases: Cases 1 and 2 below, and the two subcases of Case 3.

1. s.mode_r = rec.
Then the only change from s to s' is the removal of the single message from the sr channel. Since also u.mode_r = rec, this corresponds to the right change in A_G^{p'}.

2. s.mode_r = accept and id = s.id_r.
Then, from the definition of R_HG we have that u.mode_r = idle and id ∈ u.good_r, so that the required state change of the receiver variables of A_G^{p'} is described by the first alternative in the nested if-then-else construct in the step rule for receive_pkt_sr(m, id). A case analysis shows that all variables of A_G^{p'} are handled correctly. The interesting cases are current-ok and good_r.
For current-ok, we consider two cases.
First, if id = s.id_s, then we have id = u.last_s. Moreover, s.mode_s ∈ {send, rec} by Invariant 9.6. If s.mode_s = rec then Invariant 9.5 implies that s.current-ok is already false, so setting it to false in A_H^{h'} is a no-op, as required by the step in A_G^{p'}. If s.mode_s = send both algorithms set current-ok to false.
On the other hand, if id ≠ s.id_s, then also id ≠ u.last_s. Thus in this case neither level changes current-ok.
For good_r, note that u.good_r = {s.id_r} since s.mode_r = accept and u'.good_r = ∅ since s'.mode_r ≠ accept. Since id = s.id_r, this change is as required by the definition of receive_pkt_sr(m, id) of A_G^{p'}.

3. s.mode_r ≠ rec and (s.mode_r ≠ accept or id ≠ s.id_r)
We show that the required state changes of the receiver variables of A_G^{p'} are not described by the first alternative inside the nested if-then-else construct. First, if s.mode_r ≠ accept then u.good_r = ∅, which gives the result. Next, if s.mode_r = accept we have u.good_r = {s.id_r}, but from the definition of this case we must have id ≠ s.id_r, so again the result follows.

We now consider two cases.

(a) id ≠ s.last_r.
Then s'.nack-buf_r is s.nack-buf_r with id appended. Since id ≠ u.last_r, by the definition of R_HG, u'.nack-buf_r is likewise u.nack-buf_r with id appended. It is now easy to see that all state variables of A_G^{p'} are handled correctly.

(b) id = s.last_r.
In this case, the H level makes no changes (that is, the only difference between s and s' is that the latter has the one message deleted from the sr channel). We must thus show that all variables but sr of A_G^{p'} have the same values in u and u'.
First we note that the A_G^{p'} step does not choose the second alternative inside the nested if-then-else construct, since the definition of this case and R_HG give us that id = u.last_r.
We must show that A_G^{p'} does not choose the third alternative. The only way A_G^{p'} can choose the third alternative is if u.mode_r = idle. From the definition of R_HG we see that this is the case if s.mode_r ∈ {idle, accept}. Now, Invariant 9.3 gives us that s.last_r = nil, but this contradicts the definition of this case (id = s.last_r). Thus, we cannot have u.mode_r = idle, which again implies that A_G^{p'} does not choose the third alternative.
That suffices.


a = send_pkt_sr(done, id)

This step of A_H^{h'} changes done-buf_s and may change the channel sr, but from the definition of R_HG we see that this does not change any of the variables in A_G^{p'}, so we have u = u'. Thus, the finite execution fragment u clearly has the right properties.

a ∈ {send_pkt_rs(ack, id, b), receive_pkt_rs(ack, id, b)}

Then it is easy to see that (u, send_pkt_rs(id, b), u') and (u, receive_pkt_rs(id, b), u'), respectively, are steps of A_G^{p'}.


a = receive_pkt_sr(done, id)

We consider cases.

1. s.mode_r = accept and id = s.id_r.

There are two subcases.

(a) s.mode_s ≠ needid or
    (s.mode_s = needid and s.jd_s ≠ s.jd_r) or
    (s.mode_s = needid and s.jd_s = s.jd_r and (accept, s.jd_r, s.id_r) ∈ s.rs)

We show that (u, shrink_good_r({id}), u') ∈ steps(A_G^{p'}). This step (and execution fragment) clearly has the right trace.

First, we show that shrink_good_r({id}) is enabled in u.

i. s.mode_s ≠ needid
Then the precondition of shrink_good_r({id}) is satisfied by u. The only interesting case is if s.mode_s = send. In this case we must show that u.last_s ≠ id, i.e., that s.id_s ≠ id. Since (done, id) ∈ s.sr, Invariant 9.8 gives the result.

ii. s.mode_s = needid and s.jd_s ≠ s.jd_r
Here, it suffices to show that id ∉ u.good_s. From R_HG we get that u.good_s = {id' | (accept, s.jd_s, id') ∈ s.rs}. From Invariant 9.9 Part 3 we get that u.good_s is a subset of the set S defined as S = {id' | (s.jd_s, id') ∈ s.seen_r}, so it suffices to show that id ∉ S. Since s.id_r = id ≠ nil, we get from Invariant 9.9 Part 1 that (s.jd_r, id) ∈ s.seen_r, and Part 2 of the same invariant then implies that (s.jd_s, id) ∉ s.seen_r, since s.jd_s ≠ s.jd_r in the case we consider here. Thus, the result follows.

iii. s.mode_s = needid and s.jd_s = s.jd_r and (accept, s.jd_r, s.id_r) ∈ s.rs
Invariant 9.10 implies that this situation cannot occur.

We now show that the variable changes are allowed by the step of A_G^{p'}.

First, we show that good_r is handled correctly. By the definition of this case and R_HG, we get that u.good_r = {id} and u'.good_r = ∅. Thus, good_r changes in a way allowed by shrink_good_r({id}) in A_G^{p'}.

We must show that no other variables have different values in u' and u. The interesting cases are mode_r, last_r, and good_s.

For mode_r we have s.mode_r = accept and s'.mode_r = idle, but then R_HG gives us u'.mode_r = u.mode_r = idle, as needed.
For last_r we have u.last_r = nil from Invariant 9.3 since s.mode_r = accept, and u'.last_r = nil from the definition of the A_H^{h'} step. Thus, last_r is unchanged as needed.
Finally, we consider good_s.

i. s.mode_s ≠ needid
Then, since also s'.mode_s ≠ needid, R_HG gives us u'.good_s = u.good_s (= ∅) as needed.

ii. s.mode_s = needid and s.jd_s ≠ s.jd_r
Since s'.mode_s = needid, we have s'.jd_s ≠ nil (easy invariant), so since s'.jd_r = nil we have s'.jd_s ≠ s'.jd_r. Now, since jd_s and rs are unchanged in the A_H^{h'} step, we clearly get from R_HG that u'.good_s = u.good_s as needed.

iii. s.mode_s = needid and s.jd_s = s.jd_r and (accept, s.jd_r, s.id_r) ∈ s.rs
Again, Invariant 9.10 implies that this situation cannot occur.

(b) s.mode_s = needid, s.jd_s = s.jd_r, and (accept, s.jd_r, s.id_r) ∉ s.rs

We show that (u, shrink_good_s({id}), u'', shrink_good_r({id}), u'), where u'' is defined below, is an execution fragment of A_G^{p'} by showing that (u, shrink_good_s({id}), u'') and (u'', shrink_good_r({id}), u') are steps of A_G^{p'}. The execution fragment clearly has the right trace.

Define u'' to be the same as u except that u''.good_s = u.good_s \ {id}.
Then obviously (u, shrink_good_s({id}), u'') ∈ steps(A_G^{p'}).

We show that also (u'', shrink_good_r({id}), u') ∈ steps(A_G^{p'}).

Since u''.mode_s = u.mode_s = needid and id ∉ u''.good_s, shrink_good_r({id}) is enabled in u''.

We show that all variables are handled correctly.

For all variables other than good_s the arguments are as in the case above.

We show that u''.good_s = u'.good_s. We have s'.jd_r = nil ≠ s'.jd_s (since s'.mode_s = needid), so the definitions of R_HG and u'' give us:

u''.good_s = ({id' | (accept, s.jd_s, id') ∈ s.rs} ∪ {id}) \ {id} and
u'.good_s = {id' | (accept, s'.jd_s, id') ∈ s'.rs}.

Since jd_s and rs are unchanged, it suffices to show id ∉ {id' | (accept, s.jd_s, id') ∈ s.rs}, but since id = s.id_r, this follows directly from the definition of this subcase.
That suffices.

2. s.mode_r = ack and id = s.last_r.

We show that (u, cleanup_r, u') ∈ steps(A_G^{p'}). This step (and execution fragment) clearly has the right trace.

Since (done, id) ∈ s.sr we get from Invariant 9.8 that id ≠ s.id_s, so from the definition of R_HG and the hypothesis we get u.last_r ≠ u.last_s. Also, since s.mode_r = ack, we have u.mode_r = ack. Thus, cleanup_r is enabled in u.

All variables are handled correctly. The changes to last_r and mode_r in A_H^{h'} clearly are as required by the definition of cleanup_r in A_G^{p'}. Since mode_r ≠ accept we have u.good_r = u'.good_r (= ∅) as needed. The only other interesting case is good_s. But since mode_r ≠ accept and jd_s and rs are unchanged by the step in A_H^{h'}, we get from R_HG that u'.good_s = u.good_s as needed.

3. Otherwise

Then we claim that u' = u.

The only difference between s and s' is the removal of the done packet from the sr channel. This does not affect any of the A_G^{p'} variables.


a = grow-jd-used_s(jds)

This step adds some elements to jd-used_s, but since jd-used_s is not used in the mapping R_HG, we have u = u'. Thus, the execution fragment u has the right properties.

a = grow-issued_r(ids)

This transition adds elements to issued_r in A_H^{h'}.

We show that (u, grow_good_r(I), u'', shrink_good_r(I), u'), where u'' is defined below and I = s'.issued_r \ s.issued_r, is an execution fragment of A_G^{p'} by showing that (u, grow_good_r(I), u'') and (u'', shrink_good_r(I), u') are steps of A_G^{p'}. The execution fragment clearly has the right trace.

Define u'' to be the same as u' except that u''.good_r = u.good_r ∪ I.

From the definition of R_HG we get that I = u'.issued_r \ u.issued_r, which implies that I ∩ u.issued_r = ∅. Thus, grow_good_r(I) is enabled in u. Now, the only difference between u and u'' is that u''.good_r = u.good_r ∪ I (by explicit construction) and u''.issued_r = u.issued_r ∪ I (by the definition of grow-issued_r, R_HG and u''), but this is as required by grow_good_r(I) in A_G^{p'}.

We now consider (u'', shrink_good_r(I), u'). To show that shrink_good_r(I) is enabled in u'', we show that I ∩ u''.good_s = ∅ and that u''.last_s ∉ I.

First, consider the claim that I ∩ u''.good_s = ∅. Since u''.good_s = u.good_s we must show that I ∩ u.good_s = ∅. From Invariant 9.1 and R_HG we get that u.good_s ⊆ s.issued_r, but since I ∩ s.issued_r = ∅ (by the definition of I) the result follows directly.

Then, consider the claim that u''.last_s ∉ I. Since u''.last_s = u.last_s = s.id_s, we must show that s.id_s ∉ I. If s.id_s = nil this is obvious, so assume s.id_s ≠ nil. Then Invariant 9.7 gives us that s.id_s ∈ s.used_s, and Invariant 9.1 implies that s.id_s ∈ s.issued_r. Again, since I ∩ s.issued_r = ∅, we get the result.

Thus, shrink_good_r(I) is enabled in u''.

The only difference between u'' and u' is, by the definition of u'', that u''.good_r = u.good_r ∪ I = u'.good_r ∪ I. (The latter equality uses the definitions of grow-issued_r and R_HG to see that u'.good_r = u.good_r.) To satisfy the requirements in A_G^{p'} we must show that u'.good_r = u''.good_r \ I. This is only the case if u'.good_r \ I = u'.good_r, i.e., if u'.good_r ∩ I = ∅. Now, either u'.good_r = ∅, in which case this result follows directly, or u'.good_r = {s'.id_r} (with s'.id_r ≠ nil). In the latter case we observe that s'.id_r = s.id_r, so Invariant 9.1 implies that u'.good_r ⊆ s.issued_r, and since I ∩ s.issued_r = ∅, we get that u'.good_r ∩ I = ∅, as needed.

This concludes the simulation proof.
□
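
The two-step fragment used for grow-issued_r can be replayed on a toy state to see why it ends with good_r unchanged. The following Python sketch is illustrative only: the dictionary-based state and both functions are hypothetical, and they model just the preconditions and effects quoted in the proof above (I disjoint from issued_r for grow_good_r; I disjoint from good_s and last_s ∉ I for shrink_good_r). Any further preconditions of the G-level actions are left out as a simplifying assumption.

```python
def grow_good_r(state, ids):
    """Sketch of grow_good_r(I): add fresh ids to good_r and issued_r.

    Assumed precondition (from the proof text): I is disjoint from issued_r.
    """
    ids = frozenset(ids)
    if ids & state["issued_r"]:
        raise ValueError("grow_good_r: ids must be disjoint from issued_r")
    new_state = dict(state)
    new_state["good_r"] = state["good_r"] | ids
    new_state["issued_r"] = state["issued_r"] | ids
    return new_state


def shrink_good_r(state, ids):
    """Sketch of shrink_good_r(I): remove ids from good_r.

    Assumed precondition (from the proof text): I is disjoint from good_s
    and last_s is not in I.
    """
    ids = frozenset(ids)
    if ids & state["good_s"] or state["last_s"] in ids:
        raise ValueError("shrink_good_r: ids overlap good_s or contain last_s")
    new_state = dict(state)
    new_state["good_r"] = state["good_r"] - ids
    return new_state


def simulate_grow_issued(u, new_ids):
    """Replay the fragment (u, grow_good_r(I), u'', shrink_good_r(I), u')."""
    u_mid = grow_good_r(u, new_ids)      # u'' in the proof
    return shrink_good_r(u_mid, new_ids)  # u' in the proof


# Toy G-level state; all values are made up for illustration.
u = {
    "good_s": frozenset({1}),
    "good_r": frozenset({2}),
    "issued_r": frozenset({1, 2, 3}),
    "last_s": 3,
}
u_prime = simulate_grow_issued(u, {7, 8})
```

The net effect is that only issued_r grows, which mirrors the H-level step: grow-issued_r touches issued_r and nothing else that the mapping observes.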


With this simulation result we can prove that A_H safely implements A_G.


Theorem 9.14 (A_H safely implements A_G)
A_H ⊑_S A_G
By Lemma 9.13 and the soundness of refinement mappings (Lemma 5.8) we get Ab’ Cy Ab,
and from Lemma 5.14 we get Al, Cs At’. Thus, 

Ai; Es AG’ 
which by substitutivity (Lemma 2.16) implies 


Ay \ An 


Cy At’ \ An 


Then, by the definition of p, Ay, and Ag we get 


Ay \ An 


Es AQ’ \ p(Aa) 


Now, since p only renames actions which are subsequently hidden, this implies 


Aly \ An 


Cs AG \Ae 


which finally, by definition, yields the result
A_H ⊑_S A_G


9.4.4 Correctness

We can now turn attention to formally proving that H^{h'} correctly implements G^{p'}, which, in turn, then allows us to prove that H correctly implements G.

We start out by giving some basic results about A_H^{h'}. The first results (Lemma 9.15 and Lemma 9.16) describe certain possible steps of A_H^{h'} in the absence of crashes. The lemmas have one part for each mode in the system and each part is furthermore divided into two subparts. The first subpart states that if the system reaches a certain state, then it will stay in that state at least until a certain action (or certain actions) occur(s). The second subpart then states the resulting state if such an action indeed occurs.

In the remainder of this section we use notation like send_pkt_rs(accept, _, _) to denote the action function {send_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID}. Similarly, the expression, e.g., send_pkt_rs(accept, _, id_r) denotes the action function {send_pkt_rs(accept, jd, id_r) | jd ∈ JD}.

Lemma 9.15
A_H^{h'} satisfies each of the following formulas


1. (a) □(□(mode_s ≠ rec) ∧ mode_s = idle ⇒ (mode_s = idle W_i ⟨choose_jd(_)⟩))
   (b) □(mode_s = idle ∧ ⟨choose_jd(_)⟩ ⇒ mode_s' = needid)

2. (a) ∀jd : □(□(mode_s ≠ rec) ∧ mode_s = needid ∧ jd_s = jd ⇒
       (mode_s = needid ∧ jd_s = jd W_i ⟨receive_pkt_rs(accept, jd, _)⟩))
   (b) □(mode_s = needid ∧ ⟨receive_pkt_rs(accept, jd_s, _)⟩ ⇒ mode_s' = send)

3. (a) ∀jd : ∀id : □(□(mode_s ≠ rec) ∧ mode_s = send ∧ jd_s = jd ∧ id_s = id ⇒
       (mode_s = send ∧ jd_s = jd ∧ id_s = id W_i ⟨receive_pkt_rs(ack, id, _)⟩))
   (b) □(mode_s = send ∧ ⟨receive_pkt_rs(ack, id_s, _)⟩ ⇒ mode_s' = idle)

Proof
Easy by careful inspection of the steps of A_H^{h'}.
Lemma 9.16
A_H^{h'} satisfies each of the following formulas

1. (a) □(□(mode_r ≠ rec) ∧ mode_r = idle ⇒
       (mode_r = idle W_i ⟨receive_pkt_sr(needid, _)⟩))
   (b) ∀jd : □(mode_r = idle ∧ ⟨receive_pkt_sr(needid, jd)⟩ ⇒
       mode_r' = accept ∧ jd_r' = jd)

2. (a) ∀jd : ∀id : □(□(mode_r ≠ rec) ∧ mode_r = accept ∧ jd_r = jd ∧ id_r = id ⇒
       (mode_r = accept ∧ jd_r = jd ∧ id_r = id W_i
       ⟨receive_pkt_sr(send, _, id)⟩ ∨ ⟨receive_pkt_sr(done, id)⟩))
   (b) □(mode_r = accept ∧ ⟨receive_pkt_sr(send, _, id_r)⟩ ⇒ mode_r' = rcvd)
       □(mode_r = accept ∧ ⟨receive_pkt_sr(done, id_r)⟩ ⇒ mode_r' = idle)

3. (a) ∀id : □(□(mode_r ≠ rec) ∧ mode_r = rcvd ∧ last_r = id ⇒
       (mode_r = rcvd ∧ last_r = id W_i ⟨receive_msg(_)⟩ ∧ buf_r' = ε))
   (b) □(mode_r = rcvd ∧ ⟨receive_msg(_)⟩ ∧ buf_r' = ε ⇒ mode_r' = ack)

4. (a) ∀id : □(□(mode_r ≠ rec) ∧ mode_r = ack ∧ last_r = id ⇒
       (mode_r = ack ∧ last_r = id W_i ⟨receive_pkt_sr(done, id)⟩))
   (b) □(mode_r = ack ∧ ⟨receive_pkt_sr(done, last_r)⟩ ⇒ mode_r' = idle)

Proof
Easy by careful inspection of the steps of A_H^{h'}.
□
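
The (b) parts of Lemmas 9.15 and 9.16 can be read as two small mode-transition tables, one for the sender and one for the receiver. The Python sketch below is only an illustration of that reading: the table names and the string labels for actions are hypothetical, the parameter conditions of the lemmas (matching jd_s, id_s, id_r, last_r, and the buf_r' = ε condition) are folded into the labels, and crashes and recovery are deliberately left out.

```python
# Sender mode changes, following Lemma 9.15 parts 1(b), 2(b), 3(b).
SENDER_STEPS = {
    ("idle", "choose_jd"): "needid",
    ("needid", "receive_pkt_rs(accept)"): "send",
    ("send", "receive_pkt_rs(ack)"): "idle",
}

# Receiver mode changes, following Lemma 9.16 parts 1(b) to 4(b).
RECEIVER_STEPS = {
    ("idle", "receive_pkt_sr(needid)"): "accept",
    ("accept", "receive_pkt_sr(send)"): "rcvd",
    ("accept", "receive_pkt_sr(done)"): "idle",
    ("rcvd", "receive_msg[buf_r empty]"): "ack",
    ("ack", "receive_pkt_sr(done)"): "idle",
}


def run(table, mode, actions):
    """Apply a sequence of action labels to a mode.

    Actions without an entry leave the mode unchanged, which corresponds to
    the (a) parts: the mode persists until one of the listed actions occurs.
    """
    for action in actions:
        mode = table.get((mode, action), mode)
    return mode


# One crash-free exchange seen from each side.
sender_end = run(
    SENDER_STEPS,
    "idle",
    ["choose_jd", "receive_pkt_rs(accept)", "receive_pkt_rs(ack)"],
)
receiver_end = run(
    RECEIVER_STEPS,
    "idle",
    [
        "receive_pkt_sr(needid)",
        "receive_pkt_sr(send)",
        "receive_msg[buf_r empty]",
        "receive_pkt_sr(done)",
    ],
)
```

Both runs return to idle, which is the crash-free cycle that the later liveness lemmas argue about.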


In the proofs below we furthermore need the following simple lemma.

Lemma 9.17
A_H^{h'} ⊨ □(mode_s = needid ∧ mode_r = accept ∧ jd_s = jd_r ⇒
    ¬⟨receive_pkt_sr(send, _, id_r)⟩ ∧ ¬⟨receive_pkt_sr(done, id_r)⟩)

Proof
Directly by Invariant 9.10.
□

We now turn attention to more interesting results about the live executions of H^{h'}. The first lemma states that if the sender stays in needid mode, then it will issue infinitely many needid packets. This result is actually a simple consequence of weak fairness to the set C_{H,s1}. We give the proof in all formal detail.


Lemma 9.18 (needid liveness)
L_H^{h'} ⊨ ∀jd : □(□(mode_s = needid ∧ jd_s = jd) ⇒ □◇⟨send_pkt_sr(needid, jd)⟩)


Proof 


i 
ASSUME: a € Lh 
PROVE: aE Vjd:


( 


(mode, = needid A jd, = jd) 


(1)1. AssuME: jd is arbitrary 


PROVE: @E 


( 


(mode, = needid A jd, = jd) 


©(send_pkt ,,.(needid, jd)) ) 


©(send_pkt ,,(needid, jd)) ) 


(2)1. ASSUME: aq, is an arbitrary suffix of a 
a, - O(mode, = needid A jd, = jd) 


PROVE: 


(3)1. ASSUME: a; — 
PROVE: 


©(send_pkt ,.(needid, jd)) 


(4)1. ay EF WE(CH 51) 


Proor: By the assumption a € Lh’ we havea EK WF(Cy.s1). Then 
Assumption (2) and Lemma 3.5 Part 1 give the result. 


(2)2. QED. 


(mode, = needid A jd, = jd) 
a, — OO(send_pkt ,.(needid, jd)) 


. a, F OO(mode, € {rec, needid, send} V 


(mode, = idle A buf, = ¢)) => 


(CH, s1) 


Proor: From (4)1 by expanding WF and noting that enabled(Cy,.1) = 
(mode, € {rec,needid, send} V (mode, = idle A buf, = ¢)). 


(CH, s1) 
Proor: Directly from (4)2. 


~ AF (CH, s1) 


. a, F O(mode, € {rec,needid, send} V 
(mode, = idle A buf, =<¢)) => 


Proor: By Assumption (3), (4)3, and Rule MP. 


. QED. 


Proor: By (4)4 since Assumption (3) yields that send_pkt ,,(needid, jd) 
is the only action in Cy,5; which is enabled anywhere in a. 


(3)2. Q.E.D. 
Proor: By (3)1 and the definition of implication. 


Proor: By (2)1 and Lemma 3.5 Part 2. 


(1)2. QED. 


Proor: By (1)1 and Lemma 3.5 Part 5. 


The following lemmas (Lemmas 9.19 to 9.23) state similar basic results about the live executions of H^{h'}.


Lemma 9.19 (done liveness) 


(mode, # rec) A id € done-buf ,) ~ (send_pkt ,,.(done, id)) 


1. LE Vid : ( 


2. Lh & Vid : O( 


(mode, # rec) A 
©(send_pkt ,,.(done, id)) 


}(receive_pkt,.,(ack, id, true)) ==> 
3. Lh Wid : Vid : (C(mode, = needid A jd, # jd) A


©(receive_pkt,,(accept, jd, id)) => 
©(send_pkt ,,.(done, id))) 


Proof
We sketch the proof.

1. Consider an arbitrary suffix of a live execution of H^{h'} and assume that the sender is never crashed in this suffix. In the first state of the suffix, let id be an arbitrary element of done-buf_s and id' the first element of done-buf_s. Then send_pkt_sr(done, id') is enabled (since □(mode_s ≠ rec)) and by fairness eventually send_pkt_sr(done, id') occurs and id' is removed from done-buf_s. By repeating this argument, we get that eventually id is first in done-buf_s and then eventually send_pkt_sr(done, id) occurs.

2. Here id will infinitely often be put into done-buf_s by the receive_pkt_rs(ack, id, true) events since □(mode_s ≠ rec). Then Part 1 of this lemma implies the result.

3. Similar to Part 2. When mode_s = needid, Invariant 9.6 implies id_s = nil. Then, since jd_s ≠ jd, each receive_pkt_rs(accept, jd, id) step leads to id being inserted into done-buf_s. Part 1 of this lemma then implies the result.
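
The repetition argument in Part 1 is essentially a statement about a FIFO queue: each fair send removes the head, so any element reaches the head after finitely many sends. The Python sketch below replays that argument under simplifying assumptions that are not part of the lemma: done-buf_s is modelled as a plain queue, no crash occurs, and fairness is modelled by always taking the enabled send step. The function name sends_until is hypothetical.

```python
from collections import deque


def sends_until(done_buf, target):
    """Return the ids sent, in order, up to and including `target`.

    Each loop iteration models one send_pkt_sr(done, head) step, which is
    enabled for the current head of the buffer and removes it.
    """
    queue = deque(done_buf)
    sent = []
    while queue:
        head = queue.popleft()   # the only done packet currently enabled
        sent.append(head)
        if head == target:
            return sent
    raise ValueError("target is not in done_buf")


# id = 4 sits behind two other ids, so it is sent on the third step.
trace = sends_until([3, 1, 4], 4)
```

The number of steps before the target is sent equals its position in the buffer, which is the quantity that decreases in the informal induction.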


Lemma 9.20 (accept liveness) 


2. Lh EB Vid : Vid : O( 


1. LB! Vid : Vid : 


(G(mode, = accept A jd, = jd A id, = id) ©(send_pkt,,(accept, jd, id))) 


(mode, # rec) A mode, = accept A jd, = jd A id, = id => 
©(receive_pkt,,.(send, _, id)) V 
© (receive_pkt,,.(done, id)) V 
©(send_pkt,,(accept, jd, id))) 


sr( 
( 


rs 


Proof 


1. Similar to the proof of Lemma 9.18. 


2. AssuME: l. a € Lh 


2. jd and id are arbitrary 
3. a, is an arbitrary suffix of a 
PROVE: a, F O(mode, # rec) \ mode, = accept A jd, = jd A id, = id => 
©(receive_pkt,,.(send, _, id)) V 
©(receive_pkt,,.(done, id)) V 
©(send_pkt,.,(accept, jd, id)) 


(1)1. a; E O(mode, = accept A jd, = jd A id, = id) => O(send_pkt,,(accept, jd, id)) 


Proof: From Part 1 of this lemma, the Assumptions, and Lemma 3.5. 


(1)2. a; F O(O( mode, # rec) A mode, = accept A jd, = jd A id, = id = 
((mode, = accept A jd, = jd A id, = id) W; 
((receive_pkt,,(send, _, id)) V (receive_pkt,,.(done, id))))) 
Proof: By Lemma 9.16 Part 2(a), the Assumptions, and Lemma 3.5.


(1)3. Q.E.D.
Proof: By (1)1, (1)2, and Rule Unl1.


By Lemma 3.5 the result follows. 


Lemma 9.21 (rcvd ~ ack) 


Ly FOL 


Proof 


(mode, # rec) => (mode, = rcvd ~~ mode, = ack)) 


We only sketch this proof. During any live execution of H^{h'}, if the receiver is in rcvd mode and never crashes, then, by the definition of steps(A_H^{h'}), the only mode change of the receiver is a mode change to ack in a receive_msg(m) step that empties buf_r. Furthermore, when mode_r = rcvd no messages can be put into buf_r (which actually implies that buf_r will always contain zero or one element). Then, by fairness to receive_msg(m) steps, buf_r will eventually be emptied and hence the result follows.


Lemma 9.22 (ack liveness) 


1. Lh! 


2, Lh! 


Proof 


LE Vid : 


LE Vid : 


( 


(mode, = ack A last, = id) ©(send_pkt,.,(ack, id, true))) 


( 


(mode, # rec) \ mode, = ack A last, = id => 


©(receive_pkt ,.(done, id)) V OO(send_pkt,,(ack, id, true))) 


Similar to the proof of Lemma 9.20. 


Lemma 9.23 (ack ~> idle) 


Ly FO 


Proof 


(mode, # rec A mode, # rec) => (mode, = ack ~+ mode, = idle)) 


By Lemma 3.5 the following proof suffices. 


AssuME: 1. a € Lh 


PROVE: 


2. a, is an arbitrary suffix of a 
3. id is arbitrary 


4, Qy 


(mode, # rec A mode, # rec) 


a, K mode, = ack ~ mode, = idle 


(1)L. ay 


(mode, # rec) \ mode, = ack A last, = id => 
©(receive_pkt ,.(done, id)) V OO(send_pkt,,(ack, id, true ))) 


Proor:By Lemma 9.22 Part 2, the Assumptions, and Lemma 3.5. 




(1)2. a; — O(O( mode, # rec) A mode, = ack A last, = id => 
©(receive_pkt,,.(done, id)) V OO(receive_pkt,.,(ack, id, true))) 


Proor: By (1)1 and Channel Liveness (Qc). 


(1)3. a; —& O(O(mode, # rec) A mode, = ack A last, = id => 
(receive _pkt,,.(done, id)) V OO(receive_pkt,,.(done, id))) 


Proor:By (1)2, Lemma 9.19 Part 2, Rule MP, and Channel Liveness (Qch,s,)- 


(1)4. a, —& O(O(mode, # rec) A mode, = ack A last, = id => 
© (receive _pkt,,.(done, id ))) 


Proof: Directly from (1)3. 


(1)5. a; — O(O( mode, # rec) A mode, = ack A last, = id => 
((mode, = ack A last, = id) U; O(receive_pkt,,.(done, id)))) 


Proor: By (1)4, Lemma 9.16 Part 4(a), and the definition of U;. 


(1)6. a; — O(mode, = ack A last, = id => 
(mode, = ack A last, = id A (receive_pkt,,.(done, id)))) 


Proor: By (1)5, The Assumptions, Rule MP, and the definition of U;. 


(1)7. a, E mode, = ack A last, = id ~ 
mode, = ack ( last, = id A (receive_pkt (done, id)) 


Proor: Directly from (1)6 and the definition of ~. 


(1)8. a, & (mode, = ack A last, = id) ~+ mode, = idle 
Proor: By (1)7, the ~ property implied by Lemma 9.16 Part 4(b), and transitivity of 


~, 


(1)9. Q.E.D. 
Proor: Directly from (1)8. 


We are now ready to state and prove a very important result about the live executions of H^{h'}. In Section 9.2.3 we provided some intuitive justification of the mode of operation of the H protocol. One bad situation that we touched upon was when the sender is in needid mode but the receiver is in some “bad” mode other than idle. We argued that eventually, due to done packets, the receiver would always be reset to idle but that it could immediately enter a bad accept mode again as a result of receiving an old needid packet (i.e., a needid packet (needid, jd) for which jd ≠ jd_s) from the channel. However, since each channel step can only add a finite number of packets to a channel, at any point during execution there are only finitely many packets, and consequently only finitely many old needid packets, in the sr channel. Therefore, since the sender only adds new needid packets to sr, the receiver can only enter a bad accept state finitely many times. Thus, sooner or later either the receiver receives a new needid packet (even though there are still old ones in the channel) or all old needid packets have been received, in which case the receiver will eventually be reset to idle mode and thereafter receive a new needid packet. This is formalized in the following lemma. In the proof we use the induction rule Ind.

First, we need the following definition: in any state where mode_s = needid, define the number of old needid packets, written #_old needid, to be the number of needid packets (including duplicates) in the sr channel with jd ≠ jd_s.
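
As a concrete reading of this definition, the following Python sketch counts old needid packets in a channel snapshot. The representation is an assumption made for illustration: the sr channel is a list of tuples whose first component is the packet kind, and count_old_needid is a hypothetical helper name. The count is only meaningful in states where the sender is in needid mode, which the caller is assumed to have checked.

```python
def count_old_needid(sr_channel, jd_s):
    """Count needid packets (with duplicates) whose jd differs from jd_s."""
    return sum(
        1
        for packet in sr_channel
        if packet[0] == "needid" and packet[1] != jd_s
    )


# Channel snapshot with two old needid packets (jd = 1), one current
# needid packet (jd = 2), and an unrelated done packet.
sr = [("needid", 1), ("needid", 2), ("done", 7), ("needid", 1)]
old_before = count_old_needid(sr, jd_s=2)

# Receiving one old packet removes it from the channel and lowers the count.
sr_after = list(sr)
sr_after.remove(("needid", 1))
old_after = count_old_needid(sr_after, jd_s=2)
```

Because the sender only adds packets carrying its current jd_s, nothing in this model can raise the count while the sender stays in needid mode, so the count can serve as the measure for the induction.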
Lemma 9.24


Lh E Vjd : O(O(mode, = needid A jd, = jd \ mode, # rec) => 
(mode, = accept A jd, = jd)) 


Proof 

AssuME: a € Lh! 

PRovE: a Vjd : O(O(mode, = needid A jd, = jd \ mode, # rec) => 
(mode, = accept A jd, = jd)) 


(1)1. Assume: 1. jd is arbitrary 
2. a, is an arbitrary suffix of a 
3. a, F O(mode, = needid A jd, = jd \ mode, # rec) 
PROVE: a; — O(mode, = accept A jd, = jd) 


(2)1. CASE: a; F mode, = accept A jd, = jd 
(3)1. Q.E.D. 


Proor: Case Assumption (2) implies the goal. 


(2)2. CASE: a; F 7(mode, = accept A jd, = jd) 
(3) 1. a, E O( mode, = idle) 
(4)1. CASE: a, — mode, = idle 
(5) 1. Q.E.D. 


Proor: Assumption (4) implies the goal. 


(4)2. CASE: a, — mode, = ack 
(5)1. Q.E.D. 
Proor: By Assumptions (4) and (1).3, and Lemma 9.23. 


(4)3. CASE: a, — mode, = rcvd 
(5)1. Q.E.D. 
Proor: By Assumptions (4) and (1).3, and Lemmas 9.21 and 9.23. 
(4)4. CASE: a, — mode, = accept A jd, F jd 


(5)1. a, K mode, = accept A jd, 4 jd A jd, = jd’ A id, = id 
Proor: From Assumption (4) by letting jd’ and id be the values 
of jd, and id,, respectively, in the first state of ay. 


(5)2. ay F O(receive_pkt,,.(send, -, id)) V O(receive_pkt,,,(done, id)) V 
©(send_pkt,,(accept, jd’, id)) 


Proor: By Lemma 9.20 Part 2, Lemma 3.5, (5)1, Assumption 
(1).3, and Rule MP. 
(5)3. ay E O(receive_pkt,,(send, _,id)) V ©(receive_pkt,,.(done, id)) V 
© (receive _pkt (done, id)) 


Proor: By (5)2, Channel Liveness (Qcn,sr and Qcn,rs), Lemma 
9.19 Part 3, the Assumptions, Lemma 3.5, and Rule 3.5. 
(5)4. ay F O(receive_pkt,,.(send, -, id)) V O(receive_pkt,,.(done, id))
Proof: Directly by (5)3. 
(5)5. ay & mode, = accept A jd, = jd’ A id, = id U, 
(receive_pkt,,.(send, _,id)) V (receive _pkt,,.(done, id)) 
Proor: By (5)4, Lemma 9.16 Part 2(a), Lemma 3.5, the Assump- 
tions, and Rule MP. 


(5)6. a 


E (mode, = accept A jd, = jd’ A id, = id A 
(receive_pkt (send, _, id))) V 
©(mode, = accept A jd, = jd’ A id, = id A 
(receive_pkt ,,.(done, id))) 


Proor: Implied by (5)5. 
(5)7. ay FE O(mode, = revd) V O( mode, = idle) 


Proor: By (5)6, Lemma 9.16 Part 2(b), the Assumptions, Lemma 
3.5, and Rule MP. 


(5)8. Q.E.D. 
Proor: By (5)7, Lemmas 9.21 and 9.23, and the Assumptions. 
(4)5. Q.E.D. 
Proor: By Assumption (2) and the exhaustive cases (4)1—(4)4. 


an 


. a, F O(#oianeedid®? < #,)qneedid) 


Proor: By Assumption (1).3, # ;aneedid is defined in all states of a, and 
jd, does not change in a,. Then, since the only actions that can add needid 
packets to sr add packets with jd # jd,, the result follows. 


. Base Case 


a, F (mode, = idle A #.iqneedid = 0) ~ (mode, = accept A jd, = jd) 


(4)1. ASSUME: 1. ag is an arbitrary suffix of a; 
2. a — mode, = idle A #,;qgneedid = 0 
PROVE: a2 F O(mode, = accept A jd, = jd) 


(5)1. ao & O(#oianeedid = 0) 
Proor: By (3)2 and Assumption (4).2. 
(5)2. ag -K On({receive_pkt,,.(needid, jd’) | jd’ # jd}) 


Proor: By (5)1, Assumption (1).2, Lemma 3.5 Part 1, and the 
definition of the steps of An, 


(5)3. ay — mode, = idle W; (receive_pkt,,.(needid, -)) 


Proor: From Lemma 3.5 Part 1, the fact that a» is a suffix of 
a (Assumptions (1).2 and (4).1), Lemma 9.16 Part l(a), Assump- 
tions (1).3 and (4).2, and Rule MP. 


(5)4. ao E mode, = idle W; (receive_pkt,,,(needid, jd)) 
Proor: By (5)2 and (5)3. 
(5)5. ag F O(receive_pkt,,.(needid, jd))
Proof: From Lemma 9.18, Channel Liveness Q_{Ch,sr}, Assumption (1).3, and Rule MP.


. a) F mode, = idle Ui; (receive_pkt,,.(needid, jd)) 


Proor: By (5)4, (5)5, and the definition of U;. 


. A FE O(mode, = idle A (receive_pkt,,.(needid, jd))) 


Proor: By (5)6 and the definition of 2;. 


. QED. 


Proor: By (5)7, Lemma 9.16 Part l(b), and MP1 (and, as always, 
Lemma 3.5 Part 1 and the assumption that a» is a suffix of a). 


(4)2. Q.E.D. 


Proor: (3)3, the definition of implication, and Lemma 3.5 Part 2 gives 


ay 


(mode, = idle A #,)qaneedid = 0 => O(mode, = accept A 


jd, = jd)) which, by definition of ~-, immediately gives the result. 
(3)4. Inductive Case 


As (l< kA 


(mode, = idle A #.iqneedid = k~ 
(mode, = idle A #.,,qneedid = 1) V 
(mode, = accept A jd, = jd)))) 


(4)1. AssuME: 1. k is an arbitrary positive number 


2. @ is an arbitrary suffix of a, 
3. Q — mode, = idle A #,;qgneedid=hk 


PROVE: a2 F O((mode, = idle A #.)qgneedid < k) V 


(mode, = idle A jd, = jd)) 


(5) 1. ag E mode, = idle W; 


((receive_pkt ,.(needid, jd)) V 
({receive_pkt,,.(needid, jd’) | jd’ # jd})) 


Proor: By Lemma 9.16 Part l(a), Assumptions (1).3 and (4).3, 
and Rule MP. 


. A F O(receive_pkt ,,.(needid, jd)) 


Proor: By Lemma 9.18, Assumption (1).3, Rule MP, and Chan- 
nel Liveness Qch,sr- 


. Qo -E mode, = idle U,; 


((receive_pkt ,.(needid, jd)) V 
({receive_pkt,,.(needid, jd’) | jd’ # jd})) 


Proor: By (5)1, (5)2, and the definition of U4;. 


. Ay F O(mode, = idle A (receive_pkt,,,(needid, jd))) V 


©(mode, = idle A ({receive_pkt,,.(needid, jd’) | jd' A jd}) A 
# .qneedid < k) 


Proof: By (5)3, the definition of U%;, Assumption (4).3, and (3)2.
(5)5. ao F O(mode, = accept A jd, = jd) V
(mode, = accept A jd, # jd \ #oianeedid < k)
Proof: By (5)4, Lemma 9.16 Part 1(b) and the fact that receiving
an old needid packet reduces #,;gneedid by one.
(5)6. O( mode, = accept A jd, = jd) V 
(mode, = idle A #.,:qneedid < k) 


Proor: Similar to Case a; - (mode, = accept A jd, # jd) of 
(3)1 above (and (3)2). 


(5)7. Q.E.D. 
Proor: Directly from (5)6. 
(4)2. Q.E.D. 


Proor: From (4)1, The definition of ~+, and Lemma 3.5. 


(3)5. a, E Vn : O( mode, = idle A #.iqneedid = n => 
(mode, = accept A jd, = jd)) 
Proor: By (3)3, (3)4, Rule Ind, and the definition of ~. 
(3)6. For some number n’, 
ay F O(mode, = idle A #.iqneedid = n’) 
Proor: Directly from (3)1 when we let n’ be the value of #,)gneedid in some 
state of a, where mode, = idle. 
(3)7. a, E O(moede, = idle A #.;qneedid = n' => 
(mode, = accept A jd, = jd)) 
Proor: By (3)5 and Lemma 3.5 Part 6. 
(3)8. Q.E.D. 
Proor: By (3)6, (3)7, and Rule MP1. 
(2)3. Q.E.D. 


(1)2. Q.E.D. 
Proor: By (1)1 using the definition of implication and Lemma 3.5 Parts 2 and 5. 


Proor: By the exhaustive cases (2)1 and (2)2. 


Now, since the receiver will eventually enter accept mode with the right jd_r, eventually the
sender will receive an (accept, jd_s, id) packet as formalized by the following lemma.


Lemma 9.25

L_H^h′ ⊨ ∀jd : □(□(mode_s = needid ∧ jd_s = jd ∧ mode_r ≠ rec) ⇒
                ◇⟨receive_pkt_rs(accept, jd, _)⟩)

Proof

ASSUME: α ∈ L_H^h′
PROVE:  α ⊨ ∀jd : □(□(mode_s = needid ∧ jd_s = jd ∧ mode_r ≠ rec) ⇒
                   ◇⟨receive_pkt_rs(accept, jd, _)⟩)

(1)1. ASSUME: 1. jd is arbitrary
              2. α₁ is an arbitrary suffix of α
              3. α₁ ⊨ □(mode_s = needid ∧ jd_s = jd ∧ mode_r ≠ rec)
      PROVE:  α₁ ⊨ ◇⟨receive_pkt_rs(accept, jd, _)⟩

(2)1. α₁ ⊨ ◇(mode_r = accept ∧ jd_r = jd)
Proof: By Lemma 9.24, Assumption (1), Lemma 3.5, and Rule MP.


(2)2. ASSUME: 1. α₂ is a suffix of α₁ such that
              2. α₂ ⊨ mode_r = accept ∧ jd_r = jd ∧ id_r = id
      PROVE:  α₂ ⊨ ◇⟨receive_pkt_rs(accept, jd, _)⟩
(3)1. α₂ ⊨ (mode_r = accept ∧ jd_r = jd ∧ id_r = id) W
           (⟨receive_pkt_sr(send, _, id)⟩ ∨ ⟨receive_pkt_sr(done, id)⟩)
Proof: By Lemma 9.16 Part 2(a), Lemma 3.5, Assumptions (1) and (2), and
Rule MP.
(3)2. α₂ ⊨ □(mode_r = accept ∧ jd_r = jd ∧ id_r = id)
Proof: By (3)1, Lemma 9.17, Lemma 3.5, and Rule Unl.
(3)3. α₂ ⊨ □◇⟨send_pkt_rs(accept, jd, id)⟩
Proof: By (3)2, Lemma 9.20 Part 1, Lemma 3.5, and Rule MP.
(3)4. α₂ ⊨ □◇⟨receive_pkt_rs(accept, jd, id)⟩
Proof: The form of Q_{Ch,rs} implies that since α ⊨ Q_{Ch,rs} (α is live) and α₂ is
a suffix of α, then α₂ ⊨ Q_{Ch,rs}. This and (3)3 together with Rule MP give
the result.
(3)5. Q.E.D.
Proof: Directly from (3)4.
(2)3. Q.E.D.
Proof: By (2)1 and (2)2.
(1)2. Q.E.D.
Proof: By (1)1, the definition of implication, and Lemma 3.5.
∎
Lemma 9.26
L_H^h′ ⊨ □(□(mode_s = needid ∧ mode_r ≠ rec) ⇒ ◇(mode_s = send))
Proof


Directly from Lemma 9.25 and Lemma 9.15 Part 2(b).


We are now ready to prove the main part of the liveness proof that H^h′ correctly implements
G^ρ′, namely, if α is a live execution of H^h′ and α′ is an execution of G^ρ′ such that (α, α′) ∈ R_HG,
then α′ is live. As usual, we prove this result by contradiction. Thus, we assume that α′ is not
live and then derive a contradiction with the fact that α is live.


Lemma 9.27

Let α ∈ exec(A_H^h′) and α′ ∈ exec(A_G^ρ′) be arbitrary executions of A_H^h′ and A_G^ρ′, respectively, with
(α, α′) ∈ R_HG. Assume α ⊨ Q_H. Then α′ ⊨ ρ(Q_G).

Proof

We prove the conjecture by contradiction. Thus,
ASSUME: α′ ⊭ ρ(Q_G)
PROVE:  False

(1)1. α′ ⊨ ¬WF(ρ(C_{G,s/r1})) ∨
           ¬□(□(mode_s = needid ∧ mode_r ≠ rec) ⇒ ◇⟨ρ(C_{G,s/r2})⟩) ∨
           ¬WF(ρ(C_{G,s/r3})) ∨
           ¬WF(ρ(C_{G,s/r4})) ∨
           ¬∀p : (□◇⟨send_pkt_sr(p)⟩ ⇒ □◇⟨receive_pkt_sr(p)⟩) ∨
           ¬∀p : WF(receive_pkt_sr(p)) ∨
           ¬∀p : (□◇⟨send_pkt_rs(p)⟩ ⇒ □◇⟨receive_pkt_rs(p)⟩) ∨
           ¬∀p : WF(receive_pkt_rs(p))
Proof: Immediate by the Assumption, definition of ρ(Q_G), and the Boolean operators.


(1)2. CASE: α′ ⊨ ¬WF(ρ(C_{G,s/r1}))

(2)1. α′ ⊨ ◇□(mode_s ∈ {idle, send, rec}) ∧ ◇□¬⟨ρ(C_{G,s/r1})⟩
Proof: From Case Hypothesis (1) by noting that enabled(ρ(C_{G,s/r1})) = (mode_s ∈
{idle, send, rec}) and by expanding WF.

(2)2. α ⊨ ◇□(mode_s ∈ {idle, send, rec}) ∧ ◇□¬⟨ρ(C_{G,s/r1}) \ {prepare}⟩
Proof: From (2)1 by definition of R_HG and by Lemmas 5.10 and 5.11.

(2)3. α ⊨ ◇□(mode_s ∈ {idle, send, rec}) ∧
          ◇□¬⟨ρ(C_{G,s/r1}) \ {prepare}⟩ ∧
          ◇□¬⟨{send_pkt_sr(needid, jd) | jd ∈ JD}⟩
Proof: By (2)2 there is a suffix of α where always mode_s ∈ {idle, rec, send}. Thus
we get that no send_pkt_sr(needid, _) actions occur in that suffix, since such actions
are only enabled when mode_s = needid.

(2)4. α ⊨ ◇□(mode_s ∈ {idle, send, rec, needid}) ∧
          ◇□¬⟨(ρ(C_{G,s/r1}) \ {prepare}) ∪ {send_pkt_sr(needid, jd) | jd ∈ JD}⟩
Proof: By (2)3 by noting that if mode_s is in {idle, send, rec}, it is also in the
bigger set {idle, send, rec, needid}.

(2)5. α ⊨ ¬WF(C_{H,s1})
Proof: From (2)4 by using the definitions of WF and C_{H,s1}.

(2)6. Q.E.D.
Proof: (2)5 contradicts the assumption that α ⊨ Q_H.

(1)3. CASE: α′ ⊨ ¬□(□(mode_s = needid ∧ mode_r ≠ rec) ⇒ ◇⟨ρ(C_{G,s/r2})⟩)

(2)1. α′ ⊨ ◇(□(mode_s = needid ∧ mode_r ≠ rec) ∧ □¬⟨ρ(C_{G,s/r2})⟩)
Proof: Directly from Assumption (1).

(2)2. α′ ⊨ ◇□(mode_s = needid ∧ mode_r ≠ rec) ∧ ◇□¬⟨ρ(C_{G,s/r2})⟩
Proof: Directly from (2)1.

(2)3. α ⊨ ◇□(mode_s = needid ∧ mode_r ≠ rec)
Proof: From (2)2 by Lemma 5.11 and the definition of R_HG.

(2)4. There exists a suffix α₁ of α such that
      α₁ ⊨ □(mode_s = needid ∧ mode_r ≠ rec)
Proof: From (2)3 using Lemma 3.5 Part 3.

(2)5. α₁ ⊨ □(mode_s = needid ∧ mode_r ≠ rec) ⇒ ◇(mode_s = send)
Proof: By Lemma 9.26, Lemma 3.5 Part 1, and Rule Par.

(2)6. α₁ ⊨ ◇(mode_s = send)
Proof: By (2)4, (2)5, and Rule MP.

(2)7. Q.E.D.
Proof: (2)6 contradicts (2)4.
(1)4. CASE: α′ ⊨ ¬WF(ρ(C_{G,s/r3}))

(2)1. α′ ⊨ ◇□(mode_r = rec ∨ (mode_r = rcvd ∧ buf_r ≠ ε) ∨ mode_r = ack) ∧
           ◇□¬⟨ρ(C_{G,s/r3})⟩
Proof: By Assumption (1) and the definitions of WF and enabled(ρ(C_{G,s/r3})).

(2)2. α ⊨ ◇□(mode_r = rec ∨ (mode_r = rcvd ∧ buf_r ≠ ε) ∨ mode_r = ack) ∧
          ◇□¬⟨ρ(C_{G,s/r3})⟩
Proof: From (2)1 by definition of R_HG, the fact that ρ(C_{G,s/r3}) contains external
actions only, and Lemmas 5.10 and 5.11.

(2)3. α ⊨ ◇□(mode_r = rec ∨ (mode_r = rcvd ∧ buf_r ≠ ε) ∨ mode_r = ack) ∧
          ◇□¬⟨ρ(C_{G,s/r3})⟩ ∧
          ◇□¬⟨{send_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID}⟩
Proof: Since, by (2)2, there is a suffix of α where always mode_r ∈ {rec, rcvd, ack},
we get that no send_pkt_rs(accept, _, _) actions occur in that suffix, since such actions
are only enabled when mode_r = accept.

(2)4. α ⊨ ◇□((mode_r = rcvd ∧ buf_r ≠ ε) ∨ mode_r ∈ {rec, ack, accept}) ∧
          ◇□¬⟨ρ(C_{G,s/r3}) ∪ {send_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID}⟩
Proof: By (2)3 by noting that if eventually mode_r is always in {rec, rcvd, ack},
then it is eventually always in the bigger set {rec, rcvd, ack, accept}.

(2)5. α ⊨ ¬WF(C_{H,r1})
Proof: By (2)4 using the definition of WF and the fact that C_{H,r1} = ρ(C_{G,s/r3}) ∪
{send_pkt_rs(accept, jd, id) | jd ∈ JD ∧ id ∈ ID}.

(2)6. Q.E.D.
Proof: (2)5 contradicts the assumption that α ⊨ Q_H.


(1)5. CASE: α′ ⊨ ¬WF(ρ(C_{G,s/r4}))

(2)1. α′ ⊨ ◇□(mode_r ≠ rec ∧ nack-buf_r ≠ ε) ∧ ◇□¬⟨ρ(C_{G,s/r4})⟩
Proof: From Assumption (1) by using the definition of WF, and the fact that
enabled(ρ(C_{G,s/r4})) = (mode_r ≠ rec ∧ nack-buf_r ≠ ε).

(2)2. α ⊨ ◇□(mode_r ≠ rec ∧ nack-buf_r ≠ ε) ∧ ◇□¬⟨ρ(C_{G,s/r4})⟩
Proof: By (2)1, the definition of R_HG, the fact that ρ(C_{G,s/r4}) consists of external
actions only, and Lemmas 5.10 and 5.11.

(2)3. α ⊨ ¬WF(C_{H,r2})
Proof: By (2)2 using the definition of WF and the fact that C_{H,r2} = ρ(C_{G,s/r4}).

(2)4. Q.E.D.
Proof: (2)3 contradicts the assumption that α ⊨ Q_H.


(1)6. CASE: α′ ⊨ ¬∀p : (□◇⟨send_pkt_sr(p)⟩ ⇒ □◇⟨receive_pkt_sr(p)⟩)

(2)1. α′ ⊨ ∃p : (□◇⟨send_pkt_sr(p)⟩ ∧ ◇□¬⟨receive_pkt_sr(p)⟩)
Proof: Directly from Assumption (1).

(2)2. There exists m ∈ Msg and id ∈ ID such that
      α′ ⊨ □◇⟨send_pkt_sr(send, m, id)⟩ ∧ ◇□¬⟨receive_pkt_sr(send, m, id)⟩
Proof: By (2)1 and Lemma 3.5 Part 8.

(2)3. α ⊨ □◇⟨send_pkt_sr(send, m, id)⟩ ∧ ◇□¬⟨receive_pkt_sr(send, m, id)⟩
Proof: By (2)2, Lemma 5.10, and the fact that the actions send_pkt_sr(send, m, id)
and receive_pkt_sr(send, m, id) are external.

(2)4. α ⊨ ∃p : (□◇⟨send_pkt_sr(p)⟩ ∧ ◇□¬⟨receive_pkt_sr(p)⟩)
Proof: By (2)3 and Lemma 3.5 Part 7. (Note that the bound variable p ranges
over all packets of the form (needid, id), (send, m, id), and (done, id), whereas the
bound variable in (2)1 only ranges over packets of the form (send, m, id).)

(2)5. α ⊨ ¬∀p : (□◇⟨send_pkt_sr(p)⟩ ⇒ □◇⟨receive_pkt_sr(p)⟩)
Proof: Directly from (2)4.

(2)6. Q.E.D.
Proof: (2)5 contradicts the assumption that α ⊨ Q_H.


(1)7. CASE: α′ ⊨ ¬∀p : WF(receive_pkt_sr(p))

(2)1. α′ ⊨ ∃p : ¬WF(receive_pkt_sr(p))
Proof: Directly from Assumption (1).

(2)2. For some packet p (of the form (send, m, id)),
      α′ ⊨ ◇□¬⟨receive_pkt_sr(p)⟩ ∧ ◇□(p ∈ sr)
Proof: By (2)1, Lemma 3.5 Part 8, the definition of WF and since receive_pkt_sr(p)
is enabled when p ∈ sr.

(2)3. α ⊨ ◇□¬⟨receive_pkt_sr(p)⟩ ∧ ◇□(p ∈ sr)
Proof: By (2)2, Lemmas 5.10 and 5.11, and the facts that receive_pkt_sr(p) is external,
and if (s, u) ∈ R_HG and u ⊨ (p ∈ sr), then s ⊨ (p ∈ sr) (recall that p has the
form (send, m, id)).

(2)4. α ⊨ ¬∀p : WF(receive_pkt_sr(p))
Proof: Directly from (2)3, Lemma 3.5 Part 7 and the definition of WF.

(2)5. Q.E.D.
Proof: (2)4 contradicts the assumption that α ⊨ Q_H.

(1)8. CASE: α′ ⊨ ¬∀p : (□◇⟨send_pkt_rs(p)⟩ ⇒ □◇⟨receive_pkt_rs(p)⟩)
Proof: Similar to (1)6.

(1)9. CASE: α′ ⊨ ¬∀p : WF(receive_pkt_rs(p))
Proof: Similar to (1)7.

(1)10. Q.E.D.
Proof: By (1)1 and the exhaustive cases (1)2-(1)9.


With this result, the simulation result of the previous section, and Lemma 5.9 we can prove that
H^h′ correctly implements G^ρ′.


Lemma 9.28

H^h′ ⊑_L G^ρ′

Proof
Immediate by Lemmas 9.13, 9.27, and 5.9.
∎


And, finally, we can prove that H correctly implements G. 


Theorem 9.29
H ⊑_L G


Proof


By Lemma 9.28 and Lemma 5.15 we get
    H′ ⊑_L G^ρ′
which by substitutivity (Lemma 2.16) implies
    H′ \ A_H ⊑_L G^ρ′ \ A_H
Then, by the definition of ρ, A_H, and A_G we get
    H′ \ A_H ⊑_L G^ρ′ \ ρ(A_G)
Now, since ρ only renames actions which are subsequently hidden, this implies
    H′ \ A_H ⊑_L G′ \ A_G
which finally, by definition, yields the result
    H ⊑_L G

Due to the fact that the correct implementation relation ⊑_L is a preorder, we get the overall
result that H correctly implements S and thus solves the at-most-once message delivery problem.


Theorem 9.30
H ⊑_L S


Proof 


By Theorems 7.18, 8.19, and 9.29, and the fact that the subset relation, and thus the correct 
implementation relation (cf. Definition 2.15), is transitive. 


We now move to the timed setting to consider the Clock-Based Protocol C. 


Chapter 10 


The Clock-Based Protocol C 


The second and last low-level protocol we consider in this work is the Clock-Based Protocol of 
[LSW91], which in this work is denoted by C. As the name suggests the functionality of the 
protocol depends on the sender and receiver having access to certain clocks. Specifically, the 
sender and the receiver each has a local clock which is required to deviate from real time by at 
most some constant amount, called the clock skew. The C protocol thus consists of a sender, a 
receiver, two channels, and a special clock subsystem that guarantees that the local clocks are 
almost synchronized with real time. This structure is depicted in Figure 10.1. We model the 
clock subsystem as a live timed I/O automaton that issues ticks to the sender and the receiver. 
Exactly how to implement a clock subsystem in a distributed system falls outside the scope of 
this work [LMS85]. 

C is a timed protocol. Besides having the clock subsystem, we shall assume that channel 
delays and the maximum time difference between certain process steps are bounded. Thus, each 
component of C is specified as a live timed I/O automaton, and consequently C itself is a live 
timed I/O automaton. 

The specification S is modeled as an (untimed) live I/O automaton since the problem statement
did not mention time at all. In Section 2.3 we discussed what it means to implement
an untimed specification by a timed implementation. The idea was to consider the untimed
specification as a timed system that allows time to pass arbitrarily as long as possible liveness
assumptions are satisfied. For this reason the operator patient on safe and live I/O automata
was introduced.

We could have removed all liveness assumptions from C and used timing assumptions instead.
However, then it would have been difficult to see which timing requirements were actually needed
to guarantee the correctness of C and which were just additional timing requirements. Thus,
we introduce the minimum timing requirements and otherwise use liveness to guarantee the
progress of the system. This means that all external actions of C, which are subject to liveness
requirements in S, will be given liveness requirements in C, whereas certain internal actions,
like channel communication, will be given timing requirements. With this approach we cannot,
of course, prove any maximum response time on, e.g., acknowledgements ack(b), but if such a
response time is important, it should have been specified in S. Instead S just assumes that the
final implementation is “fast enough”.


The rest of the chapter is organized as follows. First, in Section 10.1, we present the clock
subsystem. In Section 10.2 we specify timed versions of the channels. Then, in Section 10.3, we
specify the sender and receiver and furthermore intuitively describe how the C protocol works.
Section 10.4 shows how C is obtained from its subprocesses and Section 10.5 then considers the
correctness of C. Section 10.6 discusses a “weak” version of C, where the timing assumptions
are removed, and finally Section 10.7 considers a version of C that works for a single receiver
but multiple senders.


[Figure 10.1: The Clock-Based Protocol C. The sender C_s and the receiver C_r are connected by
the channels Ch^t_sr (actions send_pkt_sr(p), receive_pkt_sr(p)) and Ch^t_rs (actions
send_pkt_rs(p), receive_pkt_rs(p)), together with the clock subsystem Cl. The external actions
shown are send_msg(m), ack(b), receive_msg(m), recover_s, and recover_r.]


10.1 The Clock Subsystem 


The clock subsystem is specified as a live timed I/O automaton Cl = (A_Cl, L_Cl). We use the
explicit specification style (cf. Section 4.2.1) to specify A_Cl, and specify L_Cl by an
environment-free timed liveness formula Q_Cl for A_Cl.


10.1.1 States and Start States 


A_Cl contains three state variables: now is, as usual, real time (ranging over T, which equals the
nonnegative real numbers), and ctime_s and ctime_r remember the last clock value sent to the
sender and receiver, respectively.


Variable   Type   Initially   Description
now        T      0           Real time.
ctime_s    T      0           Last clock value sent to the sender.
ctime_r    T      0           Last clock value sent to the receiver.


10.1.2 Actions 


Input:
    none
Output:
    tick_s(t), t ∈ T
    tick_r(t), t ∈ T
Internal:
    none
Time-passage:
    ν


10.1.3 Steps 


The clock subsystem is responsible just for performing outputs of the form tick_s(t) and tick_r(t).
This clock subsystem is constrained to produce ticks that have the property that, at any real
time now, the most recent tick at either station has value within ε of now. Thus, ε, which is
positive, denotes the clock skew. In addition, each local clock is nondecreasing, that is, successive
tick_s(t) events have nondecreasing values of t, and similarly for successive tick_r(t) events.


tick_s(t)
    Precondition:
        ctime_s ≤ t ∧
        |t − now| ≤ ε
    Effect:
        ctime_s := t

tick_r(t)
    Precondition:
        ctime_r ≤ t ∧
        |t − now| ≤ ε
    Effect:
        ctime_r := t

ν (time-passage)
    Precondition:
        now < t ∧
        |ctime_s − t| ≤ ε ∧
        |ctime_r − t| ≤ ε
    Effect:
        now := t


It is easy to see that A_Cl is in fact a safe timed I/O automaton, i.e., that it satisfies the five
axioms in Definition 2.17. Clearly S1 is satisfied, and since the tick_s(t) and tick_r(t) steps do not
change the value of now, S2 is also satisfied. S3 is satisfied since the first conjunct in the
precondition of the step rule for ν explicitly requires real time to increase in time-passage steps.
Also clearly, if (s, ν, s′) and (s′, ν, s″) are steps, then (s, ν, s″) is a step, so S4 is satisfied. For the
trajectory theorem S5, assume that (s, ν, s′) is a step. Then s.ctime_s = s′.ctime_s and
s.ctime_r = s′.ctime_r. So, the mapping from the interval [s.now, s′.now] to states, which to each
time t returns the state [now ↦ t, ctime_s ↦ s.ctime_s, ctime_r ↦ s.ctime_r], is a trajectory from s to s′.
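
The step rules of the clock subsystem can be read as a small executable model. The following Python sketch is only an illustration under stated assumptions: the names ClockState, tick, and pass_time are introduced here and do not occur in this work, time is represented by floats, any station name other than "s" is treated as the receiver, and the comparisons in the preconditions are taken to be non-strict except for the requirement that real time strictly increases.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClockState:
    """State of the clock subsystem: real time plus the last tick sent to each station."""
    now: float = 0.0
    ctime_s: float = 0.0
    ctime_r: float = 0.0


def tick(state: ClockState, station: str, t: float, eps: float) -> ClockState:
    """Perform tick_s(t) (station == "s") or tick_r(t); raise ValueError if not enabled."""
    last = state.ctime_s if station == "s" else state.ctime_r
    # Precondition: the local clock is nondecreasing and within eps of real time.
    if not (last <= t and abs(t - state.now) <= eps):
        raise ValueError("tick precondition violated")
    # Effect: only the station's last clock value changes; now is left untouched.
    if station == "s":
        return replace(state, ctime_s=t)
    return replace(state, ctime_r=t)


def pass_time(state: ClockState, t: float, eps: float) -> ClockState:
    """Perform the time-passage step that advances now to t; raise ValueError if not enabled."""
    enabled = (
        state.now < t
        and abs(state.ctime_s - t) <= eps
        and abs(state.ctime_r - t) <= eps
    )
    if not enabled:
        raise ValueError("time-passage precondition violated")
    return replace(state, now=t)


if __name__ == "__main__":
    eps = 1.0
    s = ClockState()
    s = pass_time(s, 0.5, eps)   # both last ticks (0) are within eps of 0.5
    s = tick(s, "s", 1.0, eps)   # |1.0 - 0.5| <= eps and 0 <= 1.0
    s = tick(s, "r", 1.0, eps)
    s = pass_time(s, 2.0, eps)   # |1.0 - 2.0| <= eps
    print(s)
```

In this reading, time cannot advance more than ε beyond the most recent tick at either station, so a new tick has to occur before real time can move on.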


10.1.4 Liveness 


We need no liveness restriction (other than normal admissibility). Thus, L_Cl should consist of
all admissible timed executions of A_Cl. This is specified by an environment-free timed liveness
formula Q_Cl for A_Cl as follows.


Q_Cl = true


It is easy to see that true actually induces the liveness condition consisting of all admissible
timed executions of A_Cl. In general, true is not an environment-free timed liveness formula
for a safe timed I/O automaton, but for the clock subsystem it is. The proof obligation is to
show that there exists a (timed) strategy defined on A_Cl such that any outcome of the strategy
can only consist of admissible and Zeno-tolerant timed executions. But this is clearly the case.
First of all the clock subsystem has no inputs. So, the f function of the strategy should simply
be defined to provide one tick_s(t) step and one tick_r(t) step every ε time units (remember that
ε is positive). Then any outcome will consist of admissible timed executions only.


10.2 The Timed Channels


The channels we use to connect the sender and the receiver in C are basically the same as the 
channels we used in G and H. That is, an attempt to send a packet on a channel leads to zero 
or more copies (a finite number) of the packet being put into the channel. The channels we used 
in G and H furthermore had some liveness restrictions: if we made infinitely many attempts to 
send a packet, then infinitely many copies would get through. 

Now, the C protocol needs certain timing assumptions about the channels. Not only should 
the channel delay—once a packet has been successfully placed in the channel—be bounded; it is 
also necessary to assume an upper bound on the number of attempts needed before a packet has 
been successfully placed in the channel. Thus, the timed channels should satisfy the following 
properties. 


1. For each packet p₁, if k attempts (for some positive channel retry number k) are made to
send p₁, then at least one copy of p₁ is put in the channel, even though the k attempts
may be interspersed with attempts to send other packets p₂.


2. When a copy of a packet is successfully put in the channel, the copy will be delivered at 
the other end of the channel after at most the positive channel delay time d. 


We give an explicit specification of the timed channel Ch^t_sr = (A_{Ch^t_sr}, L_{Ch^t_sr}). The specification
of the other channel Ch^t_rs = (A_{Ch^t_rs}, L_{Ch^t_rs}) is similar (and obtained by replacing sr with rs).


10.2.1 States and Start States 


The timed channel needs, as usual, a now variable to specify real time. As before the main state
variable is a multiset sr. However, in order to specify that each packet must leave the channel
at most time d after it entered the channel, we need to mark each packet with a send time (not
to be confused with the identifier timestamp we associate with messages). Thus, the multiset
contains elements of the form (p, t), where p is a packet and t is the real time when p entered
the channel. Furthermore, to specify that after at most k attempts to send a packet, the packet
has been successfully put into the channel, we have for each packet p a variable count_sr(p) which
counts the number of unsuccessful attempts to send p.


Variable       Description
now            Real time.
sr             A multiset of packets together with the time when the
               packets were sent.
count_sr(p)    For each p ∈ P, count_sr(p) contains the number of
               unsuccessful attempts to send p since the last successful
               attempt.


Define packets(sr) to be the multiset of packets in sr, i.e., the multiset obtained by removing
all send times t′ from all elements (p, t′) in sr.


10.2.2 Actions


Input:
    send_pkt_sr(p), p ∈ P
Output:
    receive_pkt_sr(p), p ∈ P
Internal:
    none
Time-passage:
    ν


10.2.3 Steps 


send_pkt_sr(p)
    Effect:
        let ps be a finite multiset of (p, now) such that
            ps ≠ ∅ if count_sr(p) = k − 1
        sr := sr ∪ ps
        if ps ≠ ∅ then
            count_sr(p) := 0
        else
            count_sr(p) := count_sr(p) + 1

receive_pkt_sr(p)
    Precondition:
        (p, t) ∈ sr
    Effect:
        sr := sr \ {(p, t)}

ν (time-passage)
    Precondition:
        t > now ∧
        ∀(p, t′) ∈ sr : (t ≤ t′ + d)
    Effect:
        now := t


Note that the operators ∪ in send_pkt_sr(p) and \ in receive_pkt_sr(p) are operators on multisets,
e.g., sr \ {(p, t)} removes one copy of (p, t) from sr.
As for the clock subsystem, it is easy to see that A_{Ch^t_sr} is in fact a safe timed I/O automaton.
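
To make the interplay between the retry counter and the delay bound concrete, the channel steps can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the specification: the class TimedChannel and its method names are invented here, the nondeterministic choice of the multiset ps is resolved by the caller through the hypothetical parameter copies, and the delay bound is read as the non-strict comparison t ≤ t′ + d.

```python
from collections import Counter


class TimedChannel:
    """Sketch of one timed channel with retry number k and delay bound d."""

    def __init__(self, k: int, d: float):
        self.k = k
        self.d = d
        self.now = 0.0
        self.sr = Counter()     # multiset of (packet, send_time) pairs
        self.count = Counter()  # count_sr(p): unsuccessful attempts since last success

    def send_pkt(self, p, copies: int) -> None:
        """Input step send_pkt_sr(p); `copies` is the size of the chosen multiset ps."""
        if copies < 0:
            raise ValueError("copies must be nonnegative")
        # After k - 1 unsuccessful attempts, the next attempt must place a copy.
        if copies == 0 and self.count[p] == self.k - 1:
            raise ValueError("this attempt is required to succeed")
        if copies > 0:
            self.sr[(p, self.now)] += copies
            self.count[p] = 0
        else:
            self.count[p] += 1

    def receive_pkt(self, p, t: float) -> None:
        """Output step receive_pkt_sr(p): remove one copy of (p, t) from sr."""
        if self.sr[(p, t)] == 0:
            raise ValueError("no such copy in the channel")
        self.sr[(p, t)] -= 1
        if self.sr[(p, t)] == 0:
            del self.sr[(p, t)]

    def can_pass_time(self, t: float) -> bool:
        """Time may advance to t only if no copy would be older than d."""
        return t > self.now and all(t <= t0 + self.d for (_p, t0) in self.sr)

    def pass_time(self, t: float) -> None:
        """Time-passage step advancing now to t."""
        if not self.can_pass_time(t):
            raise ValueError("time-passage precondition violated")
        self.now = t


if __name__ == "__main__":
    ch = TimedChannel(k=2, d=1.0)
    ch.send_pkt("p", 0)           # first attempt is lost
    ch.send_pkt("p", 2)           # second attempt must place at least one copy
    print(ch.can_pass_time(1.5))  # False: the copies sent at time 0 must leave by time 1.0
```

The time-passage precondition is what enforces the delay bound: the model does not schedule deliveries, it simply refuses to let time pass beyond t′ + d while a copy sent at time t′ is still in the channel.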


10.2.4 Liveness 


We need no liveness restriction (other than normal admissibility). Thus, L_{Ch^t_sr} should consist of
all admissible timed executions of A_{Ch^t_sr}. This is specified by an environment-free timed liveness
formula Q_{Ch^t_sr} for A_{Ch^t_sr} as follows.


Q_{Ch^t_sr} = true


Q_{Ch^t_sr} clearly is an environment-free timed liveness formula for A_{Ch^t_sr}. The g function of a (timed)
strategy could be defined to add one copy to sr every time send_pkt_sr(p) occurs. The f function
of the strategy should then simply be defined to wait the maximum time (d) before outputting
a packet again. In this way (since d is positive), if the environment provides Zeno input, the
resulting outcome will be Zeno-tolerant. In all other cases the outcome will consist of admissible
timed executions only. That suffices.


10.3 The Sender and the Receiver


Above we have specified the clock subsystem and the timed channels explicitly as live timed
I/O automata. To specify the sender and receiver processes in C, we use the implicit approach
introduced in Section 4.2.1. That is, we describe the automaton part of both the sender and
receiver live timed I/O automata as MMT-specifications (cf. Definition 4.9) A_{MMT,s} and A_{MMT,r},
respectively.

When formally defining steps(A_{MMT,s}) and steps(A_{MMT,r}) below, we furthermore provide an
intuitive description of the functionality of C.


10.3.1 States and Start States 
Sender 


The identifiers used to tag messages at the C level are taken from the sender’s local clock and
are thus also called timestamps. Thus, the domain of the variable last_s, which contains the
current timestamp, is T. The sender’s local clock is contained in time_s. This variable must be
stable, i.e., it must survive a crash.


Variable         Type                 Initially   Description
mode_s           {idle, send, rec}                The mode of the sender. Compared to G, the
                                                  sender does not need a special needid mode.
                                                  Instead the sender enters send mode directly
                                                  from idle mode.
buf_s                                             The list of messages at the sender side.
                                                  Same as at the G level.
time_s (S)                                        The sender's local clock.
current-msg_s    Msg ∪ {nil}          nil         The message about to be sent to the receiver.
last_s           T                                The timestamp chosen for the current message.
current-ack_s    Bool                 false       Acknowledgement from the receiver.


S = Stable


Receiver 


The receiver’s local clock is called time_r and, as for the sender’s local clock, it must be stable.
The receiver also contains the variables lower_r and upper_r, both ranging over T. The role of
these variables is to delimit the interval of timestamps that the receiver will accept. The variable
upper_r, which is stable, is initialized to the special timing constant β. Exactly how lower_r and
upper_r are manipulated and what the properties of β must be will be described below. The final
new variable is rm-time_r. This variable holds the timestamp of the last message delivered to the
user and is used to calculate when the receiver can safely clean up its state. This mechanism is
also described below.

Variable        Type                      Initially   Description
mode_r          {idle, rcvd, ack, rec}    idle        The mode of the receiver. Same as at the G level.
buf_r                                                 The list of messages accepted. Same as at the
                                                      G level.
time_r (S)                                            The receiver's local clock.
last_r                                                The timestamp of the last message accepted.
lower_r         T                                     A lower bound on the timestamp of a new
                                                      message that can be accepted.
upper_r (S)     T                         β           An upper bound on such a timestamp.
rm-time_r                                             Remembers the value of the local clock when
                                                      the last message accepted was delivered to
                                                      the user. Is used for cleanup purposes.
nack-buf_r                                            The list of timestamps for which the receiver
                                                      will issue a negative acknowledgement.


S = Stable


10.3.2 Actions 


Sender


Input:
    send_msg(m), m ∈ Msg
    crash_s
    receive_pkt_rs(t, b), t ∈ T, b ∈ Bool
    tick_s(t), t ∈ T
Output:
    ack(b), b ∈ Bool
    recover_s
    send_pkt_sr(m, t), m ∈ Msg, t ∈ T
Internal:
    choose_id(t), t ∈ T


Receiver


Input:
    crash_r
    receive_pkt_sr(m, t), m ∈ Msg, t ∈ T
    tick_r(t), t ∈ T
Output:
    receive_msg(m), m ∈ Msg
    recover_r
    send_pkt_rs(t, b), t ∈ T, b ∈ Bool
Internal:
    increase-lower_r(t), t ∈ T
    increase-upper_r(t), t ∈ T
    cleanup_r


10.3.3 Steps


We now provide the formal definition of the steps of the underlying automata in the MMT- 
specifications of the sender and receiver. As always we list the definition of the steps of the 
sender in the left column and the definition of the steps of the receiver in the right column. 
However, first we provide the intuition behind the functionality of C. 


Informally C works as follows during normal mode of operation. The sender associates in a
choose_id(t) step the timestamp t with the next message it wishes to transmit. The timestamp
is obtained from the sender’s local clock time_s, so the precondition for choose_id(t) guarantees
that the local clock has advanced since the last time a timestamp was chosen (last_s). The sender
is now in send mode and starts to transmit repeatedly the current packet over the channel to
the receiver. The time between every retry, as we shall see formally in Section 10.3.6, is at most
the constant ℓ_s. Based on this constant and the channel characteristics, it is possible to derive
the maximum delay before the current packet is received.
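
As a rough illustration of the kind of bound meant here (a sketch under stated assumptions, not the derivation used later in this work), write ℓ_s for the sender's retry bound, k for the channel retry number, and d for the channel delay of Section 10.2. Assume that the sender stays in send mode with the same packet, that the channel's retry counter for this packet is 0 at the first attempt, and that the first attempt occurs within ℓ_s of entering send mode. The symbols T_0 (the time at which send mode is entered) and T_recv (the time at which a copy is first delivered) are names introduced only for this illustration. Then the j-th attempt occurs no later than T_0 + j·ℓ_s, one of the first k attempts places a copy in the channel, and that copy is delivered at most d later:

```latex
T_{\mathrm{recv}} \;\le\; T_0 + k\,\ell_s + d
```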

The receiver now uses the associated timestamp to decide whether or not to accept a received
message. Roughly, it will accept a message provided that the associated timestamp is greater
than the timestamp of the last message that was accepted, which is kept in last_r. However, the
receiver does not always remember the timestamp of the last accepted message: it might forget
this information because of a crash, or simply because a long time has elapsed since the last
message was accepted and it is no longer efficient to remember it (see below). Therefore, the
receiver uses safe time estimates determined from its own local clock (time_r) to decide when
to accept a message. The estimates are kept in lower_r and upper_r; the receiver accepts if the
message’s timestamp is in the interval (lower_r, upper_r].

The lower_r bound is designed to be at least as big as the time of the last message accepted. It
can be bigger, however, but in this case it must be sufficiently less than the receiver’s local time
(at least a maximum one-way message delay (plus a double clock skew) less). This is because
the receiver should not accidentally fail to accept a valid message that takes the maximum time
to arrive. We note that the reason why we do not want to remember just the last timestamp is
that we envision using this protocol in parallel for many users, and a single lower_r bound could
be used for all users that have not sent messages for a long while. The special timing constant
ρ signifies the amount by which lower_r must be kept smaller than time_r when incremented in
increase-lower_r(t) steps. In Section 10.3.6 we show how ρ should be related to the other timing
constants of the system.

The upper_r bound is chosen to be big enough so that the receiver still accepts the most recent
messages, even if they arrive very fast. That is, it should be somewhat larger than the current
time (at least a double clock skew larger). But this bound is kept in stable storage, and therefore
should not be updated very often. Thus, it will generally be set to be a good deal larger than the
current local time. When we present the timing constraints in Section 10.3.4 below, we show that
at most some time ℓ′_r elapses between every time upper_r is increased (in an increase-upper_r(t)
step). The timing constant β, which occurs in the definition of increase-upper_r(t) below, then
has to be properly related to ℓ′_r in order to guarantee that upper_r is always big enough.
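
The acceptance rule can be summarized as a small decision function. The Python sketch below is only an illustration under stated assumptions: the function name classify_packet and the returned labels are invented here, the comparisons are read as lower_r < t ≤ upper_r and last_r < t ≤ lower_r, and the negative-acknowledgement and repeated-acknowledgement branches follow one reading of the receive_pkt_sr step rule given later in this section rather than the informal description above.

```python
def classify_packet(t, mode_r, last_r, lower_r, upper_r):
    """Return what the receiver does with a packet carrying timestamp t.

    The labels "accept", "nack", "re-ack", and "ignore" are illustrative names,
    not terms of the specification.
    """
    if mode_r == "rec":
        # While recovering from a crash, incoming packets have no effect.
        return "ignore"
    if lower_r < t <= upper_r:
        # Timestamp lies in the acceptance interval (lower_r, upper_r].
        return "accept"
    if last_r < t <= lower_r:
        # Too old to be accepted safely: queue a negative acknowledgement.
        return "nack"
    if mode_r == "idle" and t == last_r:
        # Duplicate of the last accepted message: acknowledge it again.
        return "re-ack"
    return "ignore"


if __name__ == "__main__":
    print(classify_packet(5, "idle", last_r=3, lower_r=3, upper_r=10))
    print(classify_packet(4, "idle", last_r=3, lower_r=6, upper_r=10))
    print(classify_packet(3, "idle", last_r=3, lower_r=3, upper_r=10))
```

Because accepting a message sets lower_r to its timestamp, a later copy of the same packet falls outside the acceptance interval, which is what rules out duplicate delivery in this reading.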

Unlike the H protocol, C will not continuously issue positive acknowledgements for the last
packet successfully received. Instead it issues only one positive acknowledgement and returns
to idle mode (cf. the definition of the send_pkt_rs(t, true) steps below). If this packet is lost
in the channel, the receiver will eventually receive another copy of the current packet; this will
change mode_r to ack and a new positive acknowledgement will be issued. After at most k retries,
(t, true) is successfully placed in the buffer, and at most d time units thereafter the sender
will receive the acknowledgement. Once send_pkt_rs(t, true) is enabled, it must occur within l_r
time units unless it is disabled in the meantime. This upper bound will be important in order
to specify when the receiver is allowed to clean up its state.

This completes a normal cycle of the sender and receiver. After the formal definition of the
steps, we return to the description of the special cleanup_r action and what can happen due to
crashes and recoveries.


send_msg(m)
  Effect:
    if mode_s ≠ rec then
      buf_s := buf_s ^ m

choose_id(t)
  Precondition:
    mode_s = idle ∧
    buf_s ≠ ε ∧
    time_s = t ∧
    t > last_s
  Effect:
    mode_s := send
    last_s := t
    current-msg_s := head(buf_s)
    buf_s := tail(buf_s)

send_pkt_sr(m, t)
  Precondition:
    mode_s = send ∧
    current-msg_s = m ∧
    last_s = t
  Effect:
    none

receive_pkt_rs(t, b)
  Effect:
    if mode_s = send ∧ last_s = t then
      mode_s := idle
      current-ack_s := b
      current-msg_s := nil

ack(b)
  Precondition:
    mode_s = idle ∧
    buf_s = ε ∧
    current-ack_s = b
  Effect:
    none

crash_s
  Effect:
    mode_s := rec

recover_s
  Precondition:
    mode_s = rec
  Effect:
    mode_s := idle
    last_s := time_s
    buf_s := ε
    current-msg_s := nil
    current-ack_s := false

tick_s(t)
  Effect:
    time_s := t

receive_pkt_sr(m, t)
  Effect:
    if mode_r ≠ rec then
      if lower_r < t ≤ upper_r then
        mode_r := rcvd
        buf_r := buf_r ^ m
        last_r := t
        rm-time_r := ∞
        lower_r := t
      else if last_r < t ≤ lower_r then
        nack-buf_r := nack-buf_r ^ t
      else if mode_r = idle ∧ last_r = t then
        mode_r := ack

receive_msg(m)
  Precondition:
    mode_r = rcvd ∧
    buf_r ≠ ε ∧
    head(buf_r) = m
  Effect:
    buf_r := tail(buf_r)
    if buf_r = ε then
      mode_r := ack
      rm-time_r := time_r

send_pkt_rs(t, true)
  Precondition:
    mode_r = ack ∧
    last_r = t
  Effect:
    mode_r := idle

send_pkt_rs(t, false)
  Precondition:
    mode_r ≠ rec ∧
    nack-buf_r ≠ ε ∧
    head(nack-buf_r) = t
  Effect:
    nack-buf_r := tail(nack-buf_r)

crash_r
  Effect:
    mode_r := rec

recover_r
  Precondition:
    mode_r = rec ∧
    upper_r + 2ε < time_r
  Effect:
    mode_r := idle
    last_r := 0
    rm-time_r := ∞
    buf_r := ε
    lower_r := upper_r
    upper_r := time_r + β
    nack-buf_r := ε

increase-lower_r(t)
  Precondition:
    mode_r ≠ rec ∧
    lower_r < t < time_r - ρ
  Effect:
    lower_r := t

increase-upper_r(t)
  Precondition:
    mode_r ≠ rec ∧
    upper_r < t = time_r + β
  Effect:
    upper_r := t

cleanup_r
  Precondition:
    mode_r ∈ {idle, ack} ∧
    time_r > rm-time_r + α
  Effect:
    mode_r := idle
    last_r := 0
    rm-time_r := ∞

tick_r(t)
  Effect:
    time_r := t




All that needs to be kept in stable storage is the local clocks time_s and time_r, plus the
one variable upper_r of the receiver. When the receiver side crashes and recovers again (cf. the
definition of recover_r above), it resets its lower_r bound to the old upper_r bound, to be sure
that it will not accept, and thus deliver, any message twice. This explains why we cannot just
set upper_r to infinity. It also explains another detail: the precondition for the recover_r steps
requires the local clock to grow beyond upper_r + 2ε before recovery can take place. Otherwise
the new lower_r bound would be too big compared to time_r, which could lead
to the rejection of a very fast message sent to the system after the recovery of the receiver. If
we were to allow such a rejection, C would not correctly (or even safely) implement S, since S
only allows the loss of messages that are in the system between crash and recovery.
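The recovery rule can be illustrated with a minimal Python sketch. The class `ReceiverState` and the functions `can_recover` and `recover` are hypothetical names introduced here for illustration; the fields mirror the receiver variables of the step definitions above, and the numbers in any usage are arbitrary.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ReceiverState:
    mode: str                      # "idle", "rcvd", "ack", or "rec"
    time: float                    # local clock time_r (kept in stable storage)
    upper: float                   # upper_r (kept in stable storage)
    lower: float = 0.0             # lower_r (lost on a crash)
    last: float = 0.0              # last_r (lost on a crash)
    rm_time: float = math.inf      # rm-time_r (lost on a crash)
    buf: list = field(default_factory=list)
    nack_buf: list = field(default_factory=list)


def can_recover(r: ReceiverState, eps: float) -> bool:
    """Precondition of recover_r: the clock has passed upper_r + 2*eps."""
    return r.mode == "rec" and r.upper + 2 * eps < r.time


def recover(r: ReceiverState, eps: float, beta: float) -> None:
    """Effect of recover_r: the old upper_r becomes the new lower_r."""
    if not can_recover(r, eps):
        raise ValueError("recover_r is not enabled")
    r.mode = "idle"
    r.last = 0.0
    r.rm_time = math.inf
    r.buf = []
    r.lower = r.upper              # nothing at or below the old bound is accepted again
    r.upper = r.time + beta
    r.nack_buf = []
```

Because the new lower bound is the old upper bound, every timestamp that could have been accepted before the crash lies outside the new acceptance interval.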

The way the receiver informs the sender that the sender is in a bad send state is similar
to the way this is done at the G level: when the receiver receives a packet (m, t) where t is
not between lower_r and upper_r, it should issue a negative acknowledgement for t. However,
if t < last_r, the receiver has already successfully received a message with a later timestamp,
so (m, t) cannot be the current packet of the sender. In this situation the receiver does not
issue the negative acknowledgement. (Note that due to crashes or clean-ups (see below), the
receiver may forget last_r. However, in this case last_r = 0, and the receiver will issue negative
acknowledgements for all “bad” timestamps and, in particular, the current one.)
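The receiver's reaction to an incoming packet can be summarized in a short Python sketch. The function name `classify_packet` and the result labels are hypothetical. The branch boundaries follow the receive_pkt_sr(m, t) step, under the assumption that the acceptance interval is left-open, (lower_r, upper_r], and that the negative-acknowledgement case is last_r < t ≤ lower_r.

```python
def classify_packet(t, mode, last, lower, upper):
    """Return what the receiver does with a packet carrying timestamp t."""
    if mode == "rec":
        return "ignore"            # a crashed receiver does nothing
    if lower < t <= upper:
        return "accept"            # good timestamp: deliver the message
    if last < t <= lower:
        return "nack"              # bad timestamp that may be the current one
    if mode == "idle" and last == t:
        return "re-ack"            # duplicate of the last accepted packet
    return "ignore"                # older than last_r: not the current packet
```

After a crash or cleanup last_r is 0, so every positive timestamp at or below lower_r falls into the "nack" branch, which matches the parenthetical remark above.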

Finally we consider the clean-up mechanism of the receiver. When a long time has elapsed
since the receiver started to issue positive acknowledgements for the last packet accepted, it can
be sure that the sender has received the acknowledgement, and is thus allowed to forget last_r
and move to idle mode. This is specified in the definition of cleanup_r above. Section 10.3.6
describes how large the timing constant α occurring in the precondition should be.


10.3.4 Timing Constraints 
We can now specify sets(A_{MMT,s}), boundmap(A_{MMT,s}), sets(A_{MMT,r}), and boundmap(A_{MMT,r})
and thus complete the MMT-specifications of the sender and the receiver.


Sender 


The correctness of C depends on an upper bound on the send_pkt_sr(m, t) actions of the sender.
Thus, sets(A_{MMT,s}) contains only one set of locally-controlled actions, and boundmap(A_{MMT,s})
then associates a lower and an upper bound with this set. Formally we have

$$C^{t}_{C,s} \triangleq \{ \mathit{send\_pkt}_{sr}(m,t) \mid m \in \mathit{Msg} \land t \in T \}$$

and

$$b_l(C^{t}_{C,s}) \triangleq 0 \qquad b_u(C^{t}_{C,s}) \triangleq l_s$$

where l_s is a positive real.


Receiver 


Similarly, as mentioned above, we put bounds on two sets of locally-controlled actions of the
receiver. The two constants l_r and l'_r are both positive reals.

$$C^{t}_{C,r1} \triangleq \{ \mathit{send\_pkt}_{rs}(id, \mathit{true}) \mid id \in ID \}$$

$$C^{t}_{C,r2} \triangleq \{ \mathit{increase\text{-}upper}_r(t) \mid t \in T \}$$

and

$$b_l(C^{t}_{C,r1}) = 0 \qquad b_u(C^{t}_{C,r1}) = l_r$$

$$b_l(C^{t}_{C,r2}) = 0 \qquad b_u(C^{t}_{C,r2}) = l'_r$$


10.3.5 The Sender and Receiver Safe Timed I/O Automata 


The safe timed I/O automata of the sender and receiver processes in C are now given by (cf.
Definition 4.10)

$$A_{C,s} \triangleq \mathit{time}(A_{MMT,s})$$

$$A_{C,r} \triangleq \mathit{time}(A_{MMT,r})$$


10.3.6 Derived Timing Constants 


Before we specify the liveness requirements for the sender and receiver processes of C, we return
to the three timing constants β, ρ, and α occurring in the definition of the steps of the sender
and receiver, and show how they should be related to the other timing constants. We give the
intuition behind the constants, and in the proofs in Section 10.5 we show that the properties of
the constants actually guarantee correctness. We first repeat the other timing constants, which
are all positive reals:

ε     The maximum clock skew from real time (at both the sender and receiver side).

l_s   An upper time bound between retransmissions of message packets (m, t) from the sender.

l_r   An upper time bound between retransmissions of positive acknowledgement packets (t, true)
      from the receiver.

l'_r  An upper bound between increase-upper_r(t) steps of the receiver. (This upper bound will
      usually be bigger than l_r since increase-upper_r(t) writes to stable storage.)

d     An upper bound on channel delay.

Furthermore, the channel retry number k is a fixed positive integer, which represents the number
of retries that will guarantee delivery of a packet.

We consider β, ρ, and α one by one.


The Timing Constant β


The timing constant β occurs in the definition of the increase-upper_r(t) steps above and indicates
the amount by which upper_r should be set bigger than time_r. Assume that the sender’s local
time is ε ahead of real time and the receiver’s time is ε behind. If the sender picks a timestamp
for the current message and this message arrives very fast (in fact arbitrarily fast, since we have
no lower bounds in the system) at the receiver, the timestamp of this message will be 2ε larger
than the receiver’s local time. Since the message must be accepted, upper_r must be at least 2ε
larger than time_r at any moment (where the receiver is not crashed). When increase-upper_r(t)
has occurred, it will recur within l'_r time units. Thus, β should satisfy

$$\beta > 2\epsilon + l'_r$$

Note that the smaller β is, the more often increase-upper_r(t) steps (and thus writes to stable storage)
are required to happen. On the other hand, if β is chosen too big, recovery will be delayed (cf.
the definition of recover_r).
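The intuition behind the bound on β can be replayed numerically. The sketch below is only an illustration of the informal argument above, not of the formal proof: the function names are hypothetical, and it assumes that upper_r was last set to time_r + β and that time_r has since advanced by the full l'_r before the next increase-upper_r(t) step.

```python
def worst_case_margin(beta, l_r_prime):
    """Smallest value of upper_r - time_r just before the next increase."""
    return beta - l_r_prime


def fast_message_accepted(beta, eps, l_r_prime):
    """True if upper_r stays at least 2*eps above time_r in the worst case."""
    return worst_case_margin(beta, l_r_prime) >= 2 * eps
```

For example, with ε = 0.5 and l'_r = 2.0, a value β = 4.0 leaves a margin of 2.0, whereas β = 2.5 leaves only 0.5, which is less than the required 2ε = 1.0.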


The Timing Constant ρ


The timing constant ρ occurs in the definition of the increase-lower_r(t) steps above and indicates
the amount by which lower_r must be smaller than time_r. The ρ bound should guarantee that
very slow messages from the sender will still be accepted. Assume the sender’s local time is ε
behind real time and the receiver’s local time is ε ahead. At the time the sender associates a new
timestamp t with the current message, t = time_r - 2ε. Now, the sender will succeed in placing
the current packet in the channel after at most k retries, and the delay between each retry is at
most l_s. Thus, at most k l_s time units after the timestamp was chosen, the current packet
must have been placed in the channel, and after at most d further time units the packet will be received.
Thus, during the time of transmission, the receiver’s local time has increased by at most k l_s + d
time units (it cannot have increased by more since it was already the maximum amount ahead
of real time). We finally get that the timestamp t will be time_r - (k l_s + d + 2ε) at the time of
receipt in this worst case. Thus,

$$\rho > k\,l_s + d + 2\epsilon$$
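The worst case for ρ can be replayed step by step. In this sketch the function name `worst_case_age` is hypothetical and real time 0 is taken, as an arbitrary assumption, to be the moment the timestamp is chosen; the returned value is how far the timestamp lags behind the receiver's clock when the packet finally arrives.

```python
def worst_case_age(k, l_s, d, eps):
    """Return time_r - t at receipt in the worst case described in the text."""
    real_choose = 0.0
    t = real_choose - eps                    # sender clock is eps behind real time
    real_receive = real_choose + k * l_s + d # k retries of l_s each, then delay d
    time_r = real_receive + eps              # receiver clock is eps ahead
    return time_r - t                        # equals k*l_s + d + 2*eps
```

The constant ρ has to exceed this value so that lower_r, which is kept more than ρ below time_r by increase-lower_r(t) steps, is still below the timestamp of such a slow packet.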


The Timing Constant α


We finally consider α, which occurs in the definition of cleanup_r. Clearly, α is the most
complicated of the timing constants.

There is no bound on how fast new packets can arrive at the receiver, nor are there bounds
on how fast the receiver delivers accepted messages to the user. The α bound has to indicate
the first time by which it is no longer necessary to remember last_r. This bound thus has to be
calculated from the time the last message accepted (i.e., the message for which last_r gives the
timestamp) is delivered.

We consider a situation where neither the sender nor the receiver crashes. 

Let now_{rm} be a real time when receive_msg(m) occurs and buf_r becomes empty, and let
time_{r,rm} be the corresponding value of time_r. Also, let now_{send-ack,i} denote the real time when
the receiver performs its ith send_pkt_rs(t, true) step for the current timestamp t (contained in
last_r). We have

$$\mathit{now}_{send\text{-}ack,1} \leq \mathit{now}_{rm} + l_r$$

The maximum delay until the receiver receives (m, t) again is k l_s + d. (Just before the receiver
performed send_pkt_rs(t, true) the sender might have succeeded in putting a copy of (m, t) into
the channel, and this copy could be so fast that it arrives with no delay at the receiver, i.e.,
just before send_pkt_rs(t, true). Since such copies are not buffered by the receiver, the receiver
has to wait for the next copy, which arrives after at most k l_s + d time units.) Thus,

$$\begin{aligned}
\mathit{now}_{send\text{-}ack,2} &\leq \mathit{now}_{send\text{-}ack,1} + (k\,l_s + d + l_r) \\
&\leq \mathit{now}_{rm} + l_r + (k\,l_s + d + l_r)
\end{aligned}$$

And for the kth send_pkt_rs(t, true),

$$\begin{aligned}
\mathit{now}_{send\text{-}ack,k} &\leq \mathit{now}_{send\text{-}ack,k-1} + (k\,l_s + d + l_r) \\
&\leq \mathit{now}_{send\text{-}ack,k-2} + 2(k\,l_s + d + l_r) \\
&\;\;\vdots \\
&\leq \mathit{now}_{send\text{-}ack,1} + (k-1)(k\,l_s + d + l_r) \\
&\leq \mathit{now}_{rm} + l_r + (k-1)(k\,l_s + d + l_r)
\end{aligned}$$

Now, let now_{ack-rcvd} be the real time when (t, true) is received by the sender and let time_{r,ack-rcvd}
be the corresponding value of time_r.

$$\begin{aligned}
\mathit{now}_{ack\text{-}rcvd} &\leq \mathit{now}_{send\text{-}ack,k} + d \\
&\leq \mathit{now}_{rm} + l_r + (k-1)(k\,l_s + d + l_r) + d \\
&= \mathit{now}_{rm} + k(l_r + d) + (k-1)k\,l_s
\end{aligned}$$

Since time_r - ε ≤ now and time_r + ε ≥ now, we have

$$\begin{aligned}
\mathit{time}_{r,ack\text{-}rcvd} - \epsilon &\leq \mathit{now}_{ack\text{-}rcvd} \\
&\leq \mathit{now}_{rm} + k(l_r + d) + (k-1)k\,l_s \\
&\leq \mathit{time}_{r,rm} + \epsilon + k(l_r + d) + (k-1)k\,l_s
\end{aligned}$$

Thus,

$$\mathit{time}_{r,ack\text{-}rcvd} \leq \mathit{time}_{r,rm} + k(l_r + d) + (k-1)k\,l_s + 2\epsilon$$

Since the state variable rm-time_r of the receiver is set to time_r at the time of the last
receive_msg(m) step, we see from the definition of cleanup_r that α should satisfy

$$\alpha > k(l_r + d) + (k-1)k\,l_s + 2\epsilon$$

Note that

• α depends on k² (but fortunately not on k²d).

• the 2ε in α is actually not obtained as the maximum difference between sender and receiver
  clocks but as two times the maximum receiver clock skew.
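The closed form for the bound on α can be cross-checked by unrolling the recurrence for the acknowledgement times. The following Python sketch uses hypothetical function names; it measures real-time delay from the moment receive_msg(m) empties the buffer, exactly as in the derivation above.

```python
def ack_received_delay(k, l_s, l_r, d):
    """Unroll the recurrence: delay until the sender has received (t, true)."""
    delay = l_r                        # first send_pkt_rs(t, true) within l_r
    for _ in range(k - 1):             # acknowledgements 2 .. k
        delay += k * l_s + d + l_r     # wait for next copy, then acknowledge it
    return delay + d                   # channel delay of the kth acknowledgement


def closed_form_delay(k, l_s, l_r, d):
    """Closed form k(l_r + d) + (k - 1) k l_s from the derivation."""
    return k * (l_r + d) + (k - 1) * k * l_s


def alpha_bound(k, l_s, l_r, d, eps):
    """The value that alpha has to exceed, including the 2*eps clock term."""
    return closed_form_delay(k, l_s, l_r, d) + 2 * eps
```

The quadratic term (k - 1) k l_s is visible in the loop: each of the k - 1 additional acknowledgements may have to wait for up to k retransmissions by the sender.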


10.3.7 Liveness 


The liveness requirements to the sender and receiver processes of C are weak fairness to sets of 
locally-controlled actions. 


Sender 


Let

$$C_{C,s} \triangleq \{\mathit{ack}(\mathit{true}), \mathit{ack}(\mathit{false}), \mathit{recover}_s\} \cup \{\mathit{choose\_id}(t) \mid t \in T\} \cup \{\mathit{send\_pkt}_{sr}(m, id) \mid m \in \mathit{Msg} \land id \in ID\}$$

Then the liveness condition L_{C,s} is induced by

$$Q_{C,s} = \mathit{WF}(C_{C,s})$$

Note that it is actually not necessary to add the send_pkt_sr(m, id) actions to C_{C,s}, since these
actions are already constrained by the stronger timing requirements.

In the untimed setting weak fairness to locally-controlled actions is trivially environment-free.
This is not necessarily the case in the timed setting. The problem is that even with the simple
weak fairness requirements, the system might still collaborate with a Zeno environment and
generate outcome timed executions that are not Zeno-tolerant. However, Q_{C,s} is
environment-free for A_{C,s}. Intuitively, consider a strategy that for actions in C^t_{C,s} always waits the maximum
delay l_s before performing an action in C^t_{C,s}. The actions in C_{C,s} should then be handled
similarly with some arbitrary positive real number as bound. If the sets C^t_{C,s} and C_{C,s} become
disabled, there are no requirements, so the strategy should just let time pass forever. With this
strategy, if the environment is not Zeno, each outcome timed execution will be in L_{C,s}, and if
the environment is Zeno, each outcome timed execution will be Zeno-tolerant.

Finally note that, by Proposition 3.4, Q_{C,s} is stuttering-insensitive.
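Receiver


Similarly, let

$$C_{C,r1} = \{\mathit{recover}_r\} \cup \{\mathit{receive\_msg}(m) \mid m \in \mathit{Msg}\} \cup \{\mathit{send\_pkt}_{rs}(id, \mathit{true}) \mid id \in ID\}$$

$$C_{C,r2} = \{\mathit{send\_pkt}_{rs}(t, \mathit{false}) \mid t \in T\}$$

Then L_{C,r} is induced by

$$Q_{C,r} = \mathit{WF}(C_{C,r1}) \land \mathit{WF}(C_{C,r2})$$

As for the sender, Q_{C,r} is stuttering-insensitive and environment-free for A_{C,r}.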


10.4 The Specification of C 


C is the parallel composition of sender, receiver, two channels, and clock subsystem. First define
C'' = (A_{C''}, L_{C''}) as

$$C'' = C_s \parallel C_r \parallel Ch_{sr} \parallel Ch_{rs} \parallel Cl$$

By Proposition 4.17, L_{C''} is induced by Q_C, which is defined as

$$Q_C = Q_{C,s} \land Q_{C,r} \land Q_{Ch,sr} \land Q_{Ch,rs} \land Q_{Cl}$$

C'' has channel communication as well as ticks from the clock subsystem as external (output)
actions. To obtain a specification where the ticks are hidden, define

$$\mathcal{A}_{C'} = \{\mathit{tick}_s(t) \mid t \in T\} \cup \{\mathit{tick}_r(t) \mid t \in T\}$$

Then C' = (A_{C'}, L_{C'}) is defined as

$$C' \triangleq C'' \setminus \mathcal{A}_{C'}$$

By Proposition 4.18, L_{C'} is induced by Q_C.

Finally, to get C, we hide the channel actions. First define

$$\begin{aligned}
\mathcal{A}_{C} = {} & \{\mathit{send\_pkt}_{sr}(m,t) \mid m \in \mathit{Msg} \land t \in T\} \cup {} \\
& \{\mathit{receive\_pkt}_{sr}(m,t) \mid m \in \mathit{Msg} \land t \in T\} \cup {} \\
& \{\mathit{send\_pkt}_{rs}(t,b) \mid t \in T \land b \in \mathit{Bool}\} \cup {} \\
& \{\mathit{receive\_pkt}_{rs}(t,b) \mid t \in T \land b \in \mathit{Bool}\}
\end{aligned}$$

Then the specification of C = (A_C, L_C) is given by

$$C \triangleq C' \setminus \mathcal{A}_{C}$$

Again, by Proposition 4.18, L_C is induced by Q_C.


We now turn to proving the correctness of C. This involves, among other things, use of the 
Embedding Theorem of Section 2.3. 


10.5 Correctness of C 


The objective of this section is to prove correctness of C—not with respect to G but with respect 
to the patient version of G. Then the Embedding Theorem of Chapter 2 will allow us to conclude 
that C correctly implements patient(S). 

First, recall that the G protocol uses a set ID of identifiers that has to satisfy certain 
conditions (cf. Section 8.1). We instantiate this set with the time domain T, which clearly 
satisfies the conditions. Thus, we set ID = T in the proofs below.

Next, recall from Section 9.4 that we first proved that H’ correctly implements G’, where 
H’ and G’ are the versions of H and G with channel communication as external actions. This 
was because the Execution Correspondence Theorem gives a stronger result the more external 
actions the systems have in common. The same motivation leads us first to consider the proof 
that C' correctly implements patient(G'). Thus, let G^p' = (A^p_{G'}, L^p_{G'}) be defined as

$$G^{p\prime} \triangleq \mathit{patient}(G')$$

By Proposition 4.22, L^p_{G'} is induced by Q_G and Q_G is minimal.

In order to prove that C' correctly implements G^p', we first enhance C' with history variables
and thereby obtain C^h = (A^h_C, L^h_C). We then prove several invariants of A^h_C and show the
existence of a timed refinement mapping from A^h_C to A^p_{G'}. Finally, this refinement result is used
to prove that C' correctly implements G^p' and, in turn, that C correctly implements patient(S).


10.5.1 Adding History Variables 


We add two history variables to C' and denote the resulting live timed I/O automaton by
C^h = (A^h_C, L^h_C).

Variable    Kind   Type       Description
used_s      H      T*         The list of timestamps used by the sender. Same as at the G level.
deadline    H      T ∪ {∞}    An estimated deadline on arrival of the current packet.

H = History

We now show how the history variables should be updated (cf. Section 9.4.1 where history
variables are added at the H level). We refer to Section 5.2.5 for a description of how we are
allowed to manipulate the history variables.


choose_id(t)
  Precondition:
    (* Precondition from C_s *)
  Effect:
    (* Effect clause from C_s *)
    used_s := used_s ^ t
    if mode_r ≠ rec then
      deadline := now + k l_s + d

receive_pkt_sr(m, t)
  Precondition:
    (* Precondition from Ch_sr *)
  Effect:
    (* Effect clause from Ch_sr *)
    (* Effect clause from C_r *)
    if mode_r ≠ rec then
      if lower_r < t ≤ upper_r then
        if t = last_s ∧ mode_s = send then
          deadline := ∞
      else if last_r < t ≤ lower_r then

      else if mode_r = idle ∧ last_r = t then

crash_s
  Effect:
    (* Effect clause from C_s *)
    deadline := ∞

crash_r
  Effect:
    (* Effect clause from C_r *)
    deadline := ∞


By Lemma 5.32, L^h_C is induced by Q_C.


10.5.2 Invariants 


In this section we state the invariants of A^h_C that we need below. The proofs are deferred to
Appendix C.

The first invariant deals with the local clocks of the sender and receiver in A^h_C and states
that the maximal clock skew for these is ε, which then implies that time_s and time_r can differ
by at most 2ε.


Invariant 10.1


1. time_s = ctime_s

2. time_r = ctime_r

3. |time_s - now| ≤ ε

4. |time_r - now| ≤ ε

5. |time_s - time_r| ≤ 2ε


When the receiver is not in recovery mode, upper_r is updated regularly to ensure that timestamps
chosen by the sender are never “too big”. This is expressed by the following invariant.


Invariant 10.2


1. If mode_r ≠ rec then upper_r > now + ε

2. If mode_r ≠ rec then upper_r > time_s

3. If mode_r ≠ rec then upper_r > time_r
∎


The following invariant deals with last_s. Since the local clock time_s can never decrease, the
current timestamp is taken from time_s, and last_s gets reset to time_s after a
crash, last_s is always less than or equal to time_s. Furthermore, the current
timestamp (i.e., the value of last_s when mode_s = send) can never be 0.


Invariant 10.3


1. last_s ≤ time_s

2. If mode_s = send then last_s > 0
∎


The state variable last_r contains the timestamp of the last message accepted by the receiver (or
0 right after recovery or cleanup). The next invariant states that the value of last_r can never be
considered a good timestamp by the receiver. (Otherwise the receiver could accidentally accept
the same packet twice.) Specifically, last_r is always less than or equal to lower_r. Furthermore,
lower_r is always less than or equal to upper_r.


Invariant 10.4


1. last_r ≤ lower_r

2. lower_r ≤ upper_r


The next invariant states that the number of unsuccessful attempts (since the last successful
attempt) to send a packet (m, t), where t > last_s, is always 0. Actually, no attempts can ever
have been made to transmit (m, t) since the sender cannot yet have issued the timestamp t.
Furthermore, the number of unsuccessful attempts (since the last successful attempt) to send any
packet can never be greater than or equal to k (the channel retry number).

Invariant 10.5


1. If t > last_s then count_sr(m, t) = 0
2. count_sr(m, t) ≤ k - 1


The following invariant is a key invariant and states properties of timestamps associated with 
messages and acknowledgements in the channels. 


Invariant 10.6
1. If (m, t) ∈ packets(sr) then t ≤ last_s
2. If (m, last_s) ∈ packets(sr) ∧ mode_s = send then m = current-msg_s
3. last_r ≤ last_s
4. If (t, true) ∈ packets(rs) then t ≤ last_r
5. If t ∈ nack-buf_r then t ≤ lower_r
6. If (t, b) ∈ packets(rs) then t ≤ lower_r
∎
Properties of the relationship between lower_r and last_s are stated in the following invariant.

Invariant 10.7
1. lower_r ≤ time_s
2. If last_s < time_s then lower_r < time_s
∎


The sender chooses increasing timestamps as indicated by the next invariant. 


Invariant 10.8
1. If t precedes t' in used_s then t < t'
∎


Due to the way the channels deal with the maximum channel delay d, the following invariant
holds.


Invariant 10.9


1. If ((m, t), t') ∈ sr then t' < now + d

To state the next invariant, we need a few definitions. Define the function mintime with the
following signature

$$\mathit{mintime} : P \times (\mathcal{B}(P \times T)) \rightarrow T$$

in the following way

$$\mathit{mintime}(p, ch) \triangleq \begin{cases} t & \text{if } (p,t) \in ch \land \forall (p,t') \in ch : (t' \geq t) \\ 0 & \text{otherwise} \end{cases}$$

Thus, mintime(p, ch) gives the minimal send time associated with the packet p in ch (and
defaults to 0 if p ∉ packets(ch)). Remember from the way we model the channels sr and rs that
each element in the channels has two times associated with it: one is a timestamp chosen by
the sender; the other represents the real time when the element was put into the channel and is
called the send time of the packet. The function mintime returns send times.
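The definition of mintime translates directly into a few lines of Python. In this sketch a channel is modelled, as an assumption made only for illustration, as a list of (packet, send_time) pairs, and a packet is any hashable value such as a (message, timestamp) pair.

```python
def mintime(p, ch):
    """Minimal send time associated with packet p in channel ch, or 0."""
    send_times = [t for (q, t) in ch if q == p]
    return min(send_times) if send_times else 0
```

For a channel holding two copies of the same packet with send times 1.5 and 0.7, the function returns 0.7; for a packet that is not in the channel it returns the default 0.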

For any state s of A^h_C we define s.bound in the following way, where we use m and t as
shorthands for s.current-msg_s and s.last_s, respectively.

$$s.\mathit{bound} \triangleq \begin{cases}
\infty & \text{if } s.\mathit{mode}_s \neq \mathit{send} \\
d + \mathit{mintime}((m,t), s.sr) & \text{if } s.\mathit{mode}_s = \mathit{send} \land (m,t) \in \mathit{packets}(s.sr) \\
s.\mathit{last}(C^{t}_{C,s}) + (k - 1 - s.\mathit{count}_{sr}(m,t))\,l_s + d & \text{if } s.\mathit{mode}_s = \mathit{send} \land (m,t) \notin \mathit{packets}(s.sr)
\end{cases}$$

Thus, s.bound represents an estimated time of arrival for the current packet. With this definition
we can prove very important properties of the history variable deadline.


Invariant 10.10
1. bound ≤ deadline

2. now ≤ bound

3. now ≤ deadline

4. If deadline ≠ ∞ then deadline ≤ last_s + ε + k l_s + d
5. If deadline ≠ ∞ then now ≤ last_s + ε + k l_s + d

6. If deadline ≠ ∞ then last_s > lower_r

7. If deadline ≠ ∞ then mode_s = send ∧ mode_r ≠ rec
∎


The receiver is allowed to clean up its state, i.e., to forget the timestamp of the last message
accepted and move to idle mode, when a sufficiently long time has elapsed since the message
was delivered to the user. This is because by then the receiver can be certain that the sender
has received a positive acknowledgement packet for the current packet. In the specification of
the receiver, α indicates how long the receiver must wait before cleaning up. The following
invariant captures the fact that α is properly defined. We do not prove the invariant but note
that it can be proved in a fashion similar to the proof of Invariant 10.10.

Invariant 10.11


1. If mode_s = send ∧ mode_r ≠ rec ∧ time_r > rm-time_r + α then last_s ≠ last_r


The final two invariants are trivial and state that any timestamps occurring in the channels are 
positive. 


Invariant 10.12
1. If (m, t) ∈ packets(sr) then t > 0
∎


Invariant 10.13
1. If (t, b) ∈ packets(rs) then t > 0
∎


We refer to the conjunction of the invariants above by I_{C^h}.


10.5.3 Safety 


We now define a function from states(A^h_C) to states(A^p_{G'}). Below, in Lemma 10.15, this function
is proved to be a timed refinement mapping from A^h_C to A^p_{G'} with respect to I_{C^h} and I_G. (Note
that the invariant I_G of A_G is clearly also an invariant of A^p_{G'}.)

Below we use the notation (t_1, t_2] to denote both the left-open interval from t_1 to t_2 and the
set {t | t_1 < t ≤ t_2}. Similar notation is used for the other kinds of intervals.


Definition 10.14 (Refinement Mapping from A^h_C to A^p_{G'})
If s ∈ states(A^h_C) then define R_CG(s) to be the state u ∈ states(A^p_{G'}) such that


1. u.now = s.now
   u.mode_s = s.mode_s
   u.buf_s = s.buf_s
   u.current-msg_s = s.current-msg_s
   u.current-ack_s = s.current-ack_s
   u.used_s = s.used_s
   u.mode_r = s.mode_r
   u.buf_r = s.buf_r
   u.nack-buf_r = s.nack-buf_r
2. u.last_s = (if s.last_s = 0 then nil else s.last_s)
   u.last_r = (if s.last_r = 0 then nil else s.last_r)
3. u.good_s = {s.time_s} \ {s.last_s}
4. u.good_r = (s.lower_r, s.upper_r]
5. u.issued_r = (0, s.upper_r]
6. u.current-ok = (s.deadline ≠ ∞)
7. u.sr = packets(s.sr)
   u.rs = packets(s.rs)
∎


Note how the values of most variables at the G level correspond directly to the values of the same
variables at the C level, as expressed by Part 1. Part 2 gives the trivial correspondence for the
last_s and last_r variables. Parts 3-5 contain the interesting aspects of the mapping: good_s, the
set of timestamps the sender can associate to messages, consists of the value of time_s, but only if
the clock has increased since the last timestamp was chosen; otherwise good_s is empty. The set good_r
is, as expected, the left-open interval from lower_r to upper_r. Finally, the receiver has issued all
timestamps up to and including upper_r. The correspondence in Part 6 between current-ok at
the G level and deadline at the C level is obvious. Finally, Part 7 states that each channel at
the G level is obtained from the corresponding channel at the C level by removing the send time
components of all elements.
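The mapping can be made concrete with a small Python sketch. Everything about the representation is an assumption made for illustration: states are dictionaries keyed by variable names with underscores, `None` plays the role of nil, a left-open interval (a, b] is represented by the pair (a, b), and channels are lists of (packet, send_time) pairs.

```python
import math

# Variables copied unchanged from the C level to the G level (Part 1).
COPIED = ("now", "mode_s", "buf_s", "current_msg_s", "current_ack_s",
          "used_s", "mode_r", "buf_r", "nack_buf_r")


def refine(s):
    """Map a C-level state s to the corresponding G-level state u."""
    u = {name: s[name] for name in COPIED}                    # Part 1
    u["last_s"] = None if s["last_s"] == 0 else s["last_s"]   # Part 2
    u["last_r"] = None if s["last_r"] == 0 else s["last_r"]
    u["good_s"] = {s["time_s"]} - {s["last_s"]}               # Part 3
    u["good_r"] = (s["lower_r"], s["upper_r"])                # Part 4: (lower_r, upper_r]
    u["issued_r"] = (0, s["upper_r"])                         # Part 5: (0, upper_r]
    u["current_ok"] = s["deadline"] != math.inf               # Part 6
    u["sr"] = [p for (p, _send_time) in s["sr"]]              # Part 7: drop send times
    u["rs"] = [p for (p, _send_time) in s["rs"]]
    return u
```

In a state where the sender has just chosen its timestamp (time_s equals last_s), the mapped good_s is empty; as soon as the sender's clock ticks past last_s, good_s becomes the singleton containing the new clock value.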


We now prove that R_CG is in fact a timed refinement mapping from A^h_C to A^p_{G'} (with respect
to I_{C^h} and I_G).


Lemma 10.15

A^h_C ≤^t_R A^p_{G'} via R_CG.


Proof

We prove that R_CG is a timed refinement mapping from A^h_C to A^p_{G'} with respect to I_{C^h} and I_G.
We check the three conditions (which we call real time correspondence, base case, and inductive
case, respectively) of Definition 5.18.


Real Time Correspondence

From the definition of R_CG we see that for all states s of C, R_CG(s).now = s.now as required.

Base Case

For the initial condition, let s be the start state of C. Then it is easy to check that R_CG(s) is a
start state of A^p_{G'}.


Inductive Case


Assume (s, a, s') ∈ steps(A^h_C) such that s and s' satisfy I_{C^h} and R_CG(s) satisfies I_G. Below
we consider cases based on a (and sometimes subcases of each case) and for each (sub)case we
define a finite execution fragment α of A^p_{G'} of the form (R_CG(s), a', u'', a'', u''', ..., R_CG(s')) with
vis-trace(α) = vis-trace(a). For brevity we let u denote R_CG(s) and u' denote R_CG(s').


a = ν


Then (u, ν, u') ∈ steps(A^p_{G'}): the only change in going from s to s' is that the now variable
increases; thus, by definition of R_CG, the only difference between u and u' is that the now
variable of A^p_{G'} increases, and all such changes are allowed in A^p_{G'}.

a ∈ {send_msg(m), receive_msg(m), ack(b)}


Then it is easy to see that (u, a, u') ∈ steps(A^p_{G'}). This step (and finite execution fragment)
clearly has the right visible trace.


a ∈ {crash_s, crash_r}


Then it is easy to see that (u, a, u') ∈ steps(A^p_{G'}). This step (and finite execution fragment)
clearly has the right visible trace.


The only thing to note here is the handling of deadline. The step of A^h_C changes deadline to ∞,
but this corresponds, according to the definition of R_CG, to changing current-ok to false in A^p_{G'},
as required by the definition of the crash actions in A^p_{G'}.


a = recover_s


We show that (u, recover_s, u'', shrink_good_s(s.time_s), u'), where u'' is defined below, is a finite
execution fragment of A^p_{G'} by showing that (u, recover_s, u'') and (u'', shrink_good_s(s.time_s), u')
are steps of A^p_{G'}. Clearly the execution fragment has the right visible trace.


Define u''.mode_s = idle
       u''.last_s = s.time_s
       u''.buf_s = ε
       u''.current-msg_s = nil
       u''.current-ack_s = false
       u''.x = u.x for the remaining state variables x


First, consider (u, recover_s, u''). From the definition of recover_s in A^h_C we have that s.mode_s =
rec, which implies, by the definition of R_CG, that also u.mode_s = rec. Thus, recover_s is enabled
in u. Then, by definition of u'' and recover_s in A^p_{G'}, clearly (u, recover_s, u'') ∈ steps(A^p_{G'}).


Next, consider (u'', shrink_good_s(s.time_s), u'). The definition of shrink_good_s in A^p_{G'} has no
precondition, so shrink_good_s(s.time_s) is enabled in u''. From the definitions of u'' and R_CG we
have that u''.good_s = u.good_s ⊆ {s.time_s}.


We must show that the differences between u'' and u' are allowed by the definition of the
shrink_good_s(s.time_s) steps in A^p_{G'}. This amounts, by the definition of shrink_good_s(s.time_s) in
A^p_{G'}, to showing that u'.good_s = u''.good_s \ {s.time_s} and that all other state variables of A^p_{G'}
have the same values in u'' and u'.


For good_s we have that u'.good_s = ∅ (since s'.time_s = s'.last_s), but from above we have
u''.good_s ⊆ {s.time_s}, so u'.good_s = u''.good_s \ {s.time_s} as required.


It is easy to check that the rest of the state variables of A^p_{G'} have the same values in u'' and u'.


T = recover, 


We show that 
(u, shrink_good,,((s.lower,, s.upper,|), wu”, grow_good,.((s.upper,., s.time, + [3]), wu”, recover,,u’), 
where wv” and wu” are defined below, is a finite execution fragment of A?’ by showing that 
u, shrink_good,,((s.lower,,s.upper,]), wu”), (w'’, grow_good,((s.upper,, s.time, + 3]),u'’), an 
h , k dl, [ , if if dl, , ti “ee d 
ul, recover,,u') are steps of A’,’. The execution fragment clearly has the right visible trace. 
a g & 
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Define w.good, = 9 
uae = wu. for the remaining state variables «x 


First, consider (u, shrink_good,((s.lower,,s.upper,]), wu”). From the precondition of the recover, 
steps in Ab and the definition of Raq we have that u.mode, = s.mode, = rec. Then In- 
variant 8.6 Part 2 implies that u.current-ok = false, thus, shrink_good,.((s.lower,, s.upper,.|) is 
enabled in w. Since the definition of Rcg implies that u.good, = (s.lower,, s.upper,], it is easy 
to see that (u, shrink_good,((s.lower,, s.upper,]),u’) € steps( AP’). 


Define wv” issued, = (0,8.time, + (| 
ul good, = (s.upper,,s.time, + {] 
wa = wu.2 for the remaining state variables x 


Next, consider (u”, grow_good,((s.upper,., s.time, + 3]),u’”). By definition of uw” and Rog we 
have that w issued, = u.issued, = (0,s.upper,]. So, (s.upper,, s.time, + 3] and w’issued, do 
not intersect. Also, by adding (s.upper,, s.time, + 3] to issued, we still have infinitely many 
unused timestamps left in T. Thus, grow_good,((s.upper,, s.time, + 3]) is enabled in wu’. Since 
u’.good, = by definition, it is easy to see that the change in good, is as required by the 
definition of the grow_good,((s.upper,, s.time, + 3]) steps in A®,’. To show that also issued, is 
handled correctly, we must show that u’.issued, = u’.issued, U (s.upper,., 8.time, + [3], i.e., we 
must show that (0, s.time, + 3] = (0, s.upper,]U (s.upper,, s.time, + 3]. A sufficient condition 
for this to hold is that s.time, + 6 > s.upper,, but this is implied by the precondition of the 
recover, step in AMY, To leave all other state variables unchanged is also as required by the 
definition of grow_good,((s.upper,,, s.time, + 3]) in AR’. 


Finally, consider (u’”, recover,, u’). We have u'”.mode, = u.mode, = s.mode, = rec, so recover, 

is enabled in wu’. We show that all state variables are handled according to the definition of 

recover, in A@. The only interesting cases are issued, and good,.. 


For issued, we have u” issued, = (0, s.time,+/] by definition of wu” and furthermore wu’ issued, = 
(0, s’.upper,] = (0,s.time, + 8] by definition of Reg and the recover, step in Ab’ Thus, 
u issued, = u'.issued, and this is allowed by the definition of recover, in A®,’ if |T\s’.issued,| = 
oo which is clearly satisfied and if w’.isswed, includes a) wu” issued,, b) w”.used,, and c) u”.good,. 
Case a) is clearly satisfied. For b) we have u”.used, = u.used, = (0, s.last,]. Thus, we must show 
that s.dast, < s.time, + G, but this follows from s.last, < s.time, < s.time, + 2€ < s.time, + GB, 
where the first inequality follows from Invariant 10.3 Part 1, the second inequality follows from 
Invariant 10.1 Part 5, and the third inequality follows from the definition of 3. For c) we have 
ul good, = u.good, = {s.time,} \ {s.last,}. It suffices to show that s’.time, < s'.upper, (since 
s'.time, = s.time, and s'.upper, = s.time, + 3), but that follows from Invariant 10.2 Part 2. 
Thus, issued, is handled correctly. 


For good, we have u’”.good, = (s.upper,, s.time, + 3] and u’.good, = (s'.lower,, s’.upper,] but 
since s’.lower, = s.upper, and s'.upper, = s.time, + 9, by definition of the recover, step in An, 
we have that u.good, = u'.good, as required by the definition of recover, in A?,’. 
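The three-step fragment used for the recover case (shrink, then grow, then recover) can be replayed on a toy model. The following is only a sketch under stated assumptions: timestamps are small integers, a half-open interval (lo, hi] is represented as a Python set, and the names `interval`, `GState`, `shrink_good_r`, `grow_good_r`, and `recover_fragment` are hypothetical helpers, not the report's definitions. The remaining preconditions of the G-level actions (for example the one on current-ok) are deliberately left out.

```python
from dataclasses import dataclass


def interval(lo: int, hi: int) -> frozenset:
    """Integer stand-in for the half-open interval (lo, hi]."""
    return frozenset(range(lo + 1, hi + 1))


@dataclass(frozen=True)
class GState:
    good_r: frozenset    # identifiers the receiver currently accepts
    issued_r: frozenset  # identifiers already issued (assumed meaning)


def shrink_good_r(u: GState, ids: frozenset) -> GState:
    # Removes ids from good_r; issued_r is unchanged.
    return GState(u.good_r - ids, u.issued_r)


def grow_good_r(u: GState, ids: frozenset) -> GState:
    # The new identifiers must not intersect issued_r.
    if ids & u.issued_r:
        raise ValueError("grow_good_r needs identifiers that were never issued")
    return GState(u.good_r | ids, u.issued_r | ids)


def recover_fragment(lower_r: int, upper_r: int, time_r: int, beta: int):
    # State u as given by the refinement: good_r = (lower_r, upper_r],
    # issued_r = (0, upper_r].
    u = GState(interval(lower_r, upper_r), interval(0, upper_r))
    u2 = shrink_good_r(u, interval(lower_r, upper_r))          # good_r becomes empty
    u3 = grow_good_r(u2, interval(upper_r, time_r + beta))     # fresh identifiers
    return u2, u3
```

With `lower_r = 3`, `upper_r = 7`, `time_r = 6`, and `beta = 4`, the intermediate state has an empty `good_r`, and the state before the final recover step has `good_r = (7, 10]` and `issued_r = (0, 10]`, matching the interval identity (0, upper_r] ∪ (upper_r, time_r + β] = (0, time_r + β] used in the argument.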


$a \in \{\mathit{send\_pkt}_{sr}(m,t),\ \mathit{send\_pkt}_{rs}(t, \mathit{true}),\ \mathit{send\_pkt}_{rs}(t, \mathit{false})\}$


It is straightforward to show that (u,a,u’) € steps(A%’). This step (and finite execution frag- 
ment) clearly has the right visible trace. 


$a = \mathit{receive\_pkt}_{sr}(m, t)$


We consider cases. 
1. $s.\mathit{mode}_r \neq \mathtt{rec}$ and $s.\mathit{lower}_r < t \leq s.\mathit{upper}_r$.


We show that (u, receive_pkt,,.(m,t), uw”, shrink_good,((s.lower,., t]), u’), where wu” is defined 
below, is a finite execution fragment of A®,’ by showing that (u, receive_pkt,,.(m,t), u”) and 
(w", shrink_good,((s.lower,t]), uw’) are steps of A?,’. Clearly the execution fragment has the 
right visible trace. 
Define u"”.good, = u.good, \ {t'|t' <, t} 

ua = w.« for the remaining state variables « 
First, consider (au, receive_pkt,,(m,t),u”). By the case assumption and the definition of 
Roa, we have u.mode, # rec and t € u.good,. Then, by definition of receive_pkt,,.(m, t) in 
Ab,’ and w it is easy to see that (u, receive_pkt,,(m,t), uw’) € steps( AP’). 
Then consider (wu, shrink_good,((s.lower,,t]), u’). We show that shrink_good,.((s.lower,., t]) 
is enabled in uw”. Assume w.current-ok = true (otherwise shrink_good,((s.lower,,t]) is 
trivially enabled). Then, by definition of receive_pkt,,.(m,t) in A?,’ we have u’.last, 4 t or 
u” mode, # send. By the precondition of shrink_good,((s.lower,,t]), we must show two 
conditions. 
1) First, since mode, ranges over {idle, send, rec} in Ah’, we have u.mode,(= u!.mode,) # 
needid. Thus, the first condition is satisfied. 
2) Second, assume w”.mode, = send. We must show that u’.last, ¢ (s.lower,,t]. From 
above we have wu” .Jast, # t. Then since s’.last, = u.last, = wu’ last, = t, Invariant 10.6 
Part 3 implies t < u”.last,. That suffices. 
Thus, shrink_good,((s.lower,,t]) is enabled in w”. 


We must show that all state variables of A®,’ are handled correctly. This is easy for all 
variables other than good, by explicit definition of u”. 


For good, we must show that u’.good, = u”.good, \ (s.lower,,t]. Since s’.lower, = t and 
s'.upper, = s.upper,., the definitions of Reg and u” imply w”.good, = (s.lower,., s’.upper,.]| \ 
{t’ | t! <, t} and w’.good, = (t,s’.upper,). Thus, it suffices to show that if t/ <, t, then 
t <t, but that follows directly from Invariant 10.8 Part 1. That suffices. 


2. $s.\mathit{mode}_r = \mathtt{rec}$ or $\neg(s.\mathit{lower}_r < t \leq s.\mathit{upper}_r)$


We show that (uw, receive_pkt,.(m,t), wu’) € steps(A?,’). This step (and execution fragment) 
clearly has the right trace. 


We consider subcases. 


(a) $\mathit{mode}_r = \mathtt{rec}$.
In this case the only difference between $s$ and $s'$ is that $s'.sr$ is missing one element
$((m,t), t')$ compared to $s.sr$. Thus, the only difference between $u$ and $u'$ is, by definition
of $R_{CG}$, that $u'.sr$ is missing one packet $(m,t)$ compared to $u.sr$.
Since s.mode, = rec we have u.mode, = rec, so in this case it is easy to see that 
(u, receive_pkt,.(m,t), u') € steps(A®,’). 

(b) $\mathit{mode}_r \neq \mathtt{rec}$, $\neg(s.\mathit{lower}_r < t \leq s.\mathit{upper}_r)$, and $\mathit{last}_r < t \leq \mathit{lower}_r$.
In this case the only difference between s and s’ is that s’.nack-buf, = s.nack-buf,. °t 
and s’.sr is missing one element ((m,t),¢”) compared to s.sr. Then the definition of 
Rog implies that wu’ and u are the same except that wu’.nack-buf,. = u.nack-buf,°t and 
u’.sr is missing one packet (m,t) compared to u.sr. 
Now, the definition of Rog implies that u.mode, # rec and t ¢ u.good,, and since 
s.last, < t, u.last, # t. Thus, by definition of receive_pkt,.(m,t) in A?,’, it is easy to 
see that (u, receive_pkt,,.(m,t),u’) € steps( A®,’). 




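(c) $\mathit{mode}_r \neq \mathtt{rec}$, $\neg(s.\mathit{lower}_r < t \leq s.\mathit{upper}_r)$, $\neg(\mathit{last}_r < t \leq \mathit{lower}_r)$, $\mathit{mode}_r = \mathtt{idle}$, and
$\mathit{last}_r = t$.
In this case the only difference between $s$ and $s'$ is that $s'.\mathit{mode}_r = \mathtt{ack}$ and $s'.sr$ is
missing one element $((m,t), t')$ compared to $s.sr$. Then the definition of $R_{CG}$ implies
that $u'$ and $u$ are the same except that $u.\mathit{mode}_r = \mathtt{idle}$, $u'.\mathit{mode}_r = \mathtt{ack}$, and $u'.sr$ is
missing one packet $(m,t)$ compared to $u.sr$.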
We have, by definition of Reg that u.mode, = idle and t ¢ u.good,. Furthermore, 
the case assumption and Invariant 10.12 imply that s.last, > 0, so, by the definition of 
Rea, u.last, = s.last, = t. Then, by definition of receive_pkt,,(m,t) in A?,", it is easy 
to see that (u, receive_pkt,,.(m,t), u’) € steps(A®,’). 

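(d) $\mathit{mode}_r \neq \mathtt{rec}$, $\neg(s.\mathit{lower}_r < t \leq s.\mathit{upper}_r)$, $\neg(\mathit{last}_r < t \leq \mathit{lower}_r)$, and ($\mathit{mode}_r \neq \mathtt{idle}$
or $\mathit{last}_r \neq t$).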
In this case the only difference between s and s’ is that s’.sr is missing one element 
((m,t), t’) compared to s.sr. Thus, the only difference between u and w’ is, by definition 
of Rog, that u’.sr is missing one packet (m,t) compared to u.sr. 
We must show that the definition of receive_pkt,,(m,t) in A®,’ allows all state variables 
except sr to be unchanged. (The change to sr is as required by receive_pkt,.(m, t).) 
As in the previous case we have u.mode, # rec and t ¢ u.good,. Thus, according to 
the definition of receive_pkt,,.(m,t) for the receiver of A’, the required changes to the 
state variables are not given by the first alternative in the embedded if-statement. 
Now assume t # s.last, (cf. the case assumption). Then also t # u.last,. Then, 
by definition of receive_pkt,,(m,t) in A?,’, we see that in order for A?,’ to allow 
u’ .nack-buf,, = u.nack-buf,, it suffices to show that t 4 u.last,. By the case assumption 
and Invariant 10.2 Part 2, Invariant 10.3 Part 1, and Invariant 10.6 Part 1,¢ < s.last,. 
Thus, u.last, = s.last, > t. That suffices. 
Finally, assume that t = s.last, and mode, # idle. Then it is clearly the case that 
(u, receive_pkt,,.(m,t), u') € steps( AX’). 
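The case split for receive_pkt_sr(m, t) can be summarized as a small decision function. This is a sketch for orientation only: the function name `classify_receive`, the string labels, and the table `G_FRAGMENT` are hypothetical, the state is reduced to the four receiver variables that the case conditions mention, and the non-strict upper ends of the two ranges (t ≤ upper_r and t ≤ lower_r) are an assumed reading of the conditions.

```python
def classify_receive(mode_r: str, lower_r: int, upper_r: int, last_r: int, t: int) -> str:
    """Return the label of the proof case that applies to receive_pkt_sr(m, t)."""
    if mode_r != "rec" and lower_r < t <= upper_r:
        return "1"    # packet accepted; lower_r advances to t
    if mode_r == "rec":
        return "2a"   # receiver is recovering; the packet is only removed from sr
    if last_r < t <= lower_r:
        return "2b"   # t is appended to nack-buf_r
    if mode_r == "idle" and last_r == t:
        return "2c"   # mode_r becomes ack
    return "2d"       # nothing changes except the channel sr


# G-level actions that simulate the C-level step in each case.
G_FRAGMENT = {
    "1": ["receive_pkt_sr(m,t)", "shrink_good_r((lower_r, t])"],
    "2a": ["receive_pkt_sr(m,t)"],
    "2b": ["receive_pkt_sr(m,t)"],
    "2c": ["receive_pkt_sr(m,t)"],
    "2d": ["receive_pkt_sr(m,t)"],
}
```

Only case 1 needs a two-step fragment at the G level; every other case is matched by a single receive_pkt_sr(m, t) step.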


$a = \mathit{receive\_pkt}_{rs}(t, b)$


We show that (u, receive_pkt,.,(t, 6), u’) € steps(A®,’). This step (and finite execution fragment) 
clearly has the right visible trace. 


Since (¢,6) € packets(s.rs), the definition of Rog gives (t,b) € u.rs. Thus, receive_pkt,.,(t, b) is 
enabled in wu. 


We consider cases based on the if-statement in the definition of receive_pkt,,.(t,b) of the sender 
in An, In both cases a ((t,b),t’) element of s.rs gets removed and this corresponds, by the 
definition of Rog, to removing a (t, 6) element from u.rs, but this is as required by the definition 
of receive_pkt,,(t,b) in A®,’. Below we consider the remaining state variables of AQ,’. 


Assume s.mode, # send or s.last, # t. Then the only difference between s and s’ is the 
change in the channel rs as described above, so the only difference between wu’ and wu is the 
corresponding change in sr (according to Rcg). Now, the definition of Reg implies that 
u.mode, # send or u.last, # t so we see, from the definition of receive_pkt,,(t,6) in AQ’, 
that (u, receive_pkt,,(t,b), u’) € steps( At’). 

Then, assume s.mode, = send and s.last, = t. From Invariant 10.13 we have t > 0, so the 
definition of Rag implies that u.mode, = send and u.last, = t. Thus, the condition of the 
if-statement in A?,’ is satisfied. It is now easy to see that the changes made by Ab correspond 
to allowed changes in Ab, (Note that u.last, = u’.last, but this is allowed by the definition of 
receive_pkt,,,(t, 6) in A®,’). 


a = choose_id(t) 


We show that (wu, prepare, wu”, grow_good ,(t), u’’ choose_id(t), u”, shrink_good,(t), u’), where wu”, 
ul", and wu” are defined below, is an execution fragment of A?,’ by showing that (u, prepare, wu’), 
(uw, grow_good,(t), u!”), (wl, choose_id(t), ul”), and (u!", shrink_good,(t), u’) are steps of AP,’. 
Clearly the execution fragment has the right visible trace. 


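Define  $u''.\mathit{mode}_s = \mathtt{needid}$
        $u''.\mathit{good}_s = \emptyset$
        $u''.\mathit{current\text{-}msg}_s = \mathit{head}(u.\mathit{buf}_s)$
        $u''.\mathit{buf}_s = \mathit{tail}(u.\mathit{buf}_s)$
        $u''.\mathit{current\text{-}ok} = (\text{if } u.\mathit{mode}_r \neq \mathtt{rec} \text{ then } \mathit{true} \text{ else } u.\mathit{current\text{-}ok})$
        $u''.x = u.x$ for the remaining state variables $x$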


We first consider (w, prepare, wu’). From the precondition of the choose_id(t) steps in Ah’ we have 
that s.mode, = idle and s.buf, # ¢. This implies, by the definition of Rog, that u.mode, = 
idle and u.buf, = s.buf, #¢. Thus,prepare is enabled in u (and furthermore the definition of 
u" is well-defined). Now, by definition of u’, clearly (u, prepare, wu’) € steps( A?,’). 


Define  $u'''.\mathit{good}_s = \{t\}$
        $u'''.x = u''.x$ for the remaining state variables $x$


Next, consider (w”, grow_good,(t),u’”). We have, from the definition of wv’, that u’.mode, = 
needid, so from the definition of grow_good,(t) in A®,’ we have to show three conditions in 
order to show that grow_good,(t) is enabled in wu”. First, assume w’.mode, # rec. We must 
show ¢t € wu’ .issued,. We have wu”. issued, = u.issued, = (0,s8.upper,] (by definition of u” and 
Roa) and t = s.time, > s.last, (from the precondition of choose_id(t) in Al’), so we must 
show that s.time, < s.upper, but that follows from Invariant 10.2 Part 2. Second, assume 
u’.current-ok = true. We must show ¢ € wu’.good,, thus since u”.good = u.good,, we must 
show time, € (s.lower,, s.upper,]. The lower bound follows from Invariant 10.7 Part 2 since the 
precondition of the choose_id(t) step in Ab implies that s.dast, < s.time,. The upper bound 
is already shown in the treatment of the first part of the precondition above. Third, we must 
show that t ¢ w’’.used,, thus we must show that s.time, ¢ (0,s.last,] but that follows from the 
precondition of the choose_id(t) steps in Ab’ Thus, we have shown that grow_good,(t) is enabled 
in uw”. Now, by definition of wu” and since u”.good, = 9, obviously (u”, grow_good,(t), ul”) € 
steps( AP,’), 


Define u’’.mode, = send 
wl” last, = t 
uw used, = w" used, t 
ul" ae = wx for the remaining state variables « 


Next, consider (w'”, choose_id(t), wu”). By the definitions of wv”, wu’, and Rog we have that 
wu mode, = needid and t € w.good,(= {t}). Thus, choose_id(t) is enabled in wu”. By 
definition of w’” and choose_id(t), clearly (ul, choose_id(t), ul”) € steps( A®,’). 


Finally, consider (w!”, shrink_good,(t), u’). From the definition of shrink_good,(t) in A?,’ we see 
that we must show that wu” and w’ are the same except that w’.good, = wu” good, \ {t}. From the 
definition of Reg and the choose_id(t) step of A,’ we have u/.good, = {s'.time,}\{s'.last,} = 0. 
Thus, since w””.good, = u' .good, = {t}, the condition on good, is satisfied. It is trivial to check 


that all other state variables of A?,’ are handled correctly. 


a = increase-lower,(t) 


We show that (u, shrink_good,.((0,t]), u’) € steps(A®,'). This step (and finite execution fragment) 
clearly has the right visible trace. 


From the precondition of increase-lower,(t) in Ab we have s.mode, # rec and s.lower, <t < 
8.lime, — p. 


We first show that shrink_good,((0,t]) is enabled in wu. If u.current-ok = false then this 
is obvious. So assume u.current-ok = true. We must check two conditions. First assume 
u.mode, = needid. Then we must show that (0,t]M u.good, = @ which, by definition of Rca, 
amounts to showing (0,t] N ({s.time,} \ {s.last,}) = @. Thus, it suffices to show t < s.time, 
which, by definition of increase-lower,(t) in An’ is the same as showing s’.lower, < s’.time,, 
but this is implied by Invariant 10.3 Part 1 and Invariant 10.10 Part 6, where the latter in- 
variant applies since u.current-ok = true implies s.deadline # oo which again, by definition of 
increase-lower,(t), implies s.deadline # oo. For the second condition in the precondition we 
must show, under the assumption that u.mode, = send, that u.last, # t, which is implied by 
proving s’.last, 4 s’.lower,. Again, Invariant 10.10 Part 6 gives the result. 

Thus, shrink_good,((0,¢]) is enabled in w. 


To show that (u, shrink_good,((0,t]),u’) € steps(A®’) we must finally show that u’.good, = 
u.good,, \ {t} and that all other state variables in A?,’ have the same values in u and u’. By defini- 
tion of Rog and increase-lower,(t) we have u.good, = (s.lower, s.upper,| = (s.lower,, s’.upper,,| 
and w’.good, = (t,s'.upper,], so since t > s.lower,, by the precondition of increase-lower,(t), 
it is easy to see that the condition for good, is satisfied. Since the increase-lower,(t) step of 
Ah only changes lower, and lower, is only used in the definition of Rog to define good,, it is 
obvious that all state variables, but good,, of A®,’ have the same values in wu and uw’. 


a = increase-upper,,(t) 


We show that then (u, grow_good,((s.upper,.,t]), u’) € steps( A®,’). This step (and finite execution 
fragment) clearly has the right visible trace. 


Since, by definition of Rca, u.issued, = (0, s.upper,], it is obvious that w.issued, 1 (s.upper,,t] = 
and that |T \ (w.issued, U(s.upper,., t])| = 00. Thus, a grow_good,.((s.upper,.,t]) step is enabled 
in wu. 

Now we first show that w’.issued, = u.issued,U(s.upper,., t] and u’.good, = u.good,U(s.upper,., t], 
as required by the definition of grow_good,,((s.upper,,t]) in A®,’. For issued, we have u.issued, = 
(0, s.upper,,| and w’.issued, = (0, s’.upper,] = (0,t]. Now, since t > s.upper,, by the precondition 
of increase-upper,(t), the condition for issued, is clearly satisfied. For good, we similarly have 
u.good, = (s.lower,,s.upper,|] and u’.good, = (s'.lower,,s’.upper,| = (s.lower,,t]. Thus, the 
condition for good, is also satisfied. 


We must finally show that all other state variables in A?,’ have the same values in u and wu’, but 
this is obvious since the increase-upper,.(t) step of Ah only changes upper,, and upper, is only 
used in Rog to define good, and issued,. 
$a = \mathit{cleanup}_r$


We show that (uw, cleanup,, wu’) € steps(A?,"). This step (and finite execution fragment) clearly 
has the right visible trace. 


By the precondition of cleanup, we have s.mode, € {idle, ack} and s.time, > s.rm-time, +a. 
By the definition of Reg and Invariant 10.10, we have u.mode, € {idle, ack} and u.mode, => 
u.last, # u.last,. Thus, cleanup, is enabled in wu. 


It is now easy to see that the variable changes specified by the cleanup, step of An correspond 
to the required variable changes of the cleanup, step of A®,’. (The change of rm-time, in Ab 
does not affect any of the variables of A?,’). Thus, (u, cleanup,, u’) € steps( A?,’). 


$a = \mathit{tick}_s$

We consider cases. 


1. s'.time, = s.time, 
In this case clearly s’ = s and thus wu! = u. Then the finite execution fragment u of A?,’ 
has the right properties. 
2. s'.time, # s.time, 
We show that (uw, shrink_good,(s.time,), uv", grow_good ,(s’.time),u’), where wu” is defined 
below, is a finite execution fragment of A?,’ by showing that (u, shrink_good,(s.time,), uw") 
and (u", grow_good,(s'.time), u’) are steps of A®,’. Clearly this execution fragment has the 
right visible trace. 
Define  $u''.\mathit{good}_s = \emptyset$
        $u''.x = u.x$ for the remaining state variables $x$
First, consider (u, shrink_good ,(s.time,),u”). Note that trivially shrink_good ,(s.time,) is 
enabled in u. We check that all state variables of A?,’ are handled correctly. By the 
definition of Rog we have u.good, C {s.time,}. Then, since w.good, = 0, good, is handled 
correctly. By definition all other variables of A?,’ have the same values in u and w”, which 
is also as required by the definition of shrink_good,(s.time,) in At’. 
Then, consider (uw, grow_good ,(s’.time), u’). By definition of Rcg (and the fact that mode, 
ranges over {idle, send, rec} in Abs), we have u.mode, # needid and consequently, by 
definition of wu”, u.mode, # needid. This shows that grow_good ,(s’.time) is enabled in 
ul 
By Invariant 10.3 Part 1, s.dast, < s.tame,. The Case Assumption together with the 
precondition of the tick, steps of the clock subsystem implies that s’.time, > s.time,. 
Then since s’.dast, = s.last,, we have s’.time, # s'.last,. This implies, by definition of 
Rog that u'.good, = {s'.time,}. Thus, good, is handled as required by the definition of 
grow_good ,(s'.time) in AP’. It is easy to see that all the remaining variables of A?,’ have the 
same values in w” and wu’ which is also as required by the definition of grow_good,(s'.time) 
in A?,’. That suffices. 


$a = \mathit{tick}_r$


We show that w’ = uw. Then the finite execution fragment u clearly has the right properties. 


Now, clearly uw’ = uw since the tick, step of Ab only changes time, and ctime,, and these variables 
are not mentioned in the definition of Raa. 




This concludes the simulation proof. 
This simulation result allows us to prove that Ab safely implements A®,’, and, in turn, that Ac 


safely implements G?. 


Lemma 10.16 


ale pt 
A& =St AG 


Proof 


Immediate by Lemmas 10.15 and 5.23. 


Theorem 10.17 
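$A_C \sqsubseteq_{St} \mathit{patient}(A_G)$


Proof


By Lemma 10.16 and Lemma 5.29 we get

    $A'_C \sqsubseteq_{St} \mathit{patient}(A'_G)$

which by substitutivity (Lemma 2.33) implies

    $A'_C \setminus \mathcal{A}_C \sqsubseteq_{St} \mathit{patient}(A'_G) \setminus \mathcal{A}_C$

which, by definition of $\mathcal{A}_G$ and $\mathcal{A}_C$, gives

    $A'_C \setminus \mathcal{A}_C \sqsubseteq_{St} \mathit{patient}(A'_G) \setminus \mathcal{A}_G$

By Proposition 2.38 we then get

    $A'_C \setminus \mathcal{A}_C \sqsubseteq_{St} \mathit{patient}(A'_G \setminus \mathcal{A}_G)$

which finally, by definition of $A_C$ and $A_G$, gives the result

    $A_C \sqsubseteq_{St} \mathit{patient}(A_G)$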


10.5.4 Correctness 


The liveness proof presented in this section is significantly simpler than the liveness proof in 
the proof of correctness of H. The reason is that the sender and receiver processes are very 
similar in C and G, and that the packets sent to the channels at the two levels are of the same 
type. Recall that at the H level, additional packet types (needid, accept, and done) made the 
liveness proof very complex. 

In fact, the only preliminary lemmas we need express the fact that the timing requirements
of the timed channels are sufficient to guarantee the liveness requirements specified for the
untimed channels used at the G level.


Lemma 10.18 


1. $\mathit{exec}^\infty(A_C^h) \models \forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{sr}(p)\rangle)$

2. $\mathit{exec}^\infty(A_C^h) \models \forall p : \mathit{WF}(\mathit{receive\_pkt}_{sr}(p))$


Proof
We only sketch the proofs.


1. Consider any packet $p$ and assume $\alpha$ is an admissible execution of $A_C^h$ such that
$\alpha \models \Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle$; thus, $\mathit{send\_pkt}_{sr}(p)$ occurs infinitely often in $\alpha$. For every $k$
occurrences of $\mathit{send\_pkt}_{sr}(p)$ at least one element of the form $(p,t)$, where $t$ is the send time
for $p$, is placed in $sr$. By the maximum channel delay $d$, a $\mathit{receive\_pkt}_{sr}(p)$ action occurs
not later than real time $t+d$. Then, since $\alpha$ is admissible, for every $k$ occurrences of
$\mathit{send\_pkt}_{sr}(p)$ in $\alpha$ there is at least one occurrence of $\mathit{receive\_pkt}_{sr}(p)$. Thus, since there
are infinitely many occurrences of $\mathit{send\_pkt}_{sr}(p)$, there are infinitely many occurrences of
$\mathit{receive\_pkt}_{sr}(p)$, i.e., $\alpha \models \Box\Diamond\langle\mathit{receive\_pkt}_{sr}(p)\rangle$. That suffices.


2. Consider any packet $p$ and assume $\alpha$ is an admissible execution of $A_C^h$ such that for some
suffix $\alpha_1$ of $\alpha$, $\alpha_1 \models \Box(p \in \mathit{packets}(sr))$ (the enabling condition for $\mathit{receive\_pkt}_{sr}(p)$ is
$(p \in \mathit{packets}(sr))$). Then, for any time $t$, a $\mathit{receive\_pkt}_{sr}(p)$ action occurs not later than
time $t + d$ since all packets must have left the channel after at most the channel delay
time $d$. Then, since $\alpha$ is admissible, infinitely many occurrences of $\mathit{receive\_pkt}_{sr}(p)$ occur
in $\alpha_1$. Thus, $\alpha_1 \models \Box\Diamond\langle\mathit{receive\_pkt}_{sr}(p)\rangle$. That suffices by definition of $\mathit{WF}$.
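The counting argument in Part 1 can be illustrated with a worst-case toy channel. The sketch below is an illustration under assumptions, not the report's channel automaton: the function name `worst_case_deliveries` is hypothetical, send times are plain numbers, the channel is assumed to keep exactly one packet out of every k consecutive sends (the least the retry number k allows), and each kept packet is assumed to be delivered exactly d time units after it was sent (the latest the delay bound allows).

```python
def worst_case_deliveries(send_times, k, d):
    """Return (send_time, receive_time) pairs for the packets the channel keeps.

    Only every k-th send is kept, and each kept packet is delivered
    at the last permitted moment, send_time + d.
    """
    kept = send_times[k - 1::k]
    return [(t, t + d) for t in kept]
```

Even under this adversarial behavior, n sends of the same packet lead to at least n // k receives, each within d of its send time, so infinitely many sends in an admissible execution give infinitely many receives.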


Lemma 10.19 
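1. $\mathit{exec}^\infty(A_C^h) \models \forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{rs}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{rs}(p)\rangle)$

2. $\mathit{exec}^\infty(A_C^h) \models \forall p : \mathit{WF}(\mathit{receive\_pkt}_{rs}(p))$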


Proof 
Similar to the proof of Lemma 10.18. 
a 


We can now show the main part of the liveness proof, namely, if a is a live execution of C* and 
a’ is an execution of G?’ such that (a,a’) € Reg, then a’ is live. As usual, we prove this result 
by contradiction. Thus, we assume that a’ is not live and then derive a contradiction with the 
fact that a is live. 


Lemma 10.20 


Let a € exec®(Al’) and a! € exec®(A®,') be arbitrary admissible executions of Al and AP, 
respectively, with $(\alpha, \alpha') \in R_{CG}$. Assume $\alpha \models Q_C$. Then $\alpha' \models Q_G$.


Proof 
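We prove the conjecture by contradiction. Thus,

ASSUME: $\alpha' \models \neg Q_G$
PROVE:  False

(1)1. $\alpha' \models \neg\mathit{WF}(C_{G,s/r1}) \;\lor$
      $\neg\Box(\Box(\mathit{mode}_s = \mathtt{needid} \land \mathit{mode}_r \neq \mathtt{rec}) \Rightarrow \Diamond\langle C_{G,s/r2}\rangle) \;\lor$
      $\neg\mathit{WF}(C_{G,s/r3}) \;\lor$
      $\neg\mathit{WF}(C_{G,s/r4}) \;\lor$
      $\neg\forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{sr}(p)\rangle) \;\lor$
      $\neg\forall p : \mathit{WF}(\mathit{receive\_pkt}_{sr}(p)) \;\lor$
      $\neg\forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{rs}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{rs}(p)\rangle) \;\lor$
      $\neg\forall p : \mathit{WF}(\mathit{receive\_pkt}_{rs}(p))$
  PROOF: Immediate by the Assumption, the definition of $Q_G$, and the Boolean operators.

(1)2. CASE: $\alpha' \models \neg\mathit{WF}(C_{G,s/r1})$
  (2)1. $\alpha' \models \Diamond\Box(\mathit{mode}_s \in \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\}) \land \Diamond\Box\neg\langle C_{G,s/r1}\rangle$
    PROOF: From Case Hypothesis (1) by noting that $\mathit{enabled}(C_{G,s/r1}) = (\mathit{mode}_s \in \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\})$
    and by expanding $\mathit{WF}$.
  (2)2. $\alpha \models \Diamond\Box(\mathit{mode}_s \in \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\}) \land \Diamond\Box\neg\langle C_{G,s/r1} \setminus \{\mathit{prepare}\}\rangle$
    PROOF: From (2)1 by definition of $R_{CG}$ and by Lemmas 5.25 and 5.26.
  (2)3. $\alpha \models \Diamond\Box(\mathit{mode}_s \in \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\}) \;\land$
        $\Diamond\Box\neg\langle C_{G,s/r1} \setminus \{\mathit{prepare}\}\rangle \;\land$
        $\Diamond\Box\neg\langle\{\mathit{choose\_id}(t) \mid t \in T\}\rangle$
    PROOF: By (2)2 and the definition of $A_C^h$. Consider a suffix $\alpha_1$ of $\alpha$ that satisfies
    $\alpha_1 \models \Box\neg\langle C_{G,s/r1} \setminus \{\mathit{prepare}\}\rangle$. Then if $\mathit{mode}_s$ is send it will stay send unless a crash
    occurs, in which case $\mathit{mode}_s$ changes to rec. However, once in mode rec, the sender
    will stay there since no $\mathit{recover}_s$ occurs in $\alpha_1$. Now, $\mathit{choose\_id}(t)$ actions can only
    occur if $\mathit{mode}_s = \mathtt{idle}$. However, after such an action the sender never returns to mode idle
    again, as we have just seen. Thus, there is at most one occurrence of a $\mathit{choose\_id}(t)$ action
    in $\alpha_1$. This gives the result.
  (2)4. $\alpha \models \Diamond\Box(\mathit{mode}_s \in \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\}) \land \Diamond\Box\neg\langle C_{C,s}\rangle$
    PROOF: By (2)3 and the definition of $C_{C,s}$.
  (2)5. $\alpha \models \neg\mathit{WF}(C_{C,s})$
    PROOF: From (2)4 by using the definitions of $\mathit{WF}$ and $C_{C,s}$.
  (2)6. Q.E.D.
    PROOF: (2)5 contradicts the assumption that $\alpha \models Q_C$.

(1)3. CASE: $\alpha' \models \neg\Box(\Box(\mathit{mode}_s = \mathtt{needid} \land \mathit{mode}_r \neq \mathtt{rec}) \Rightarrow \Diamond\langle C_{G,s/r2}\rangle)$
  (2)1. $\alpha' \models \Diamond\Box(\mathit{mode}_s = \mathtt{needid} \land \mathit{mode}_r \neq \mathtt{rec}) \land \Diamond\Box\neg\langle C_{G,s/r2}\rangle$
    PROOF: Directly by Assumption (1).
  (2)2. $\alpha \models \Diamond\Box(\mathit{mode}_s \notin \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\})$
    PROOF: By (2)1, the definition of $R_{CG}$, and Lemma 5.26.
  (2)3. Q.E.D.
    PROOF: (2)2 contradicts the fact that always $\mathit{mode}_s \in \{\mathtt{idle}, \mathtt{send}, \mathtt{rec}\}$ at the C level.

(1)4. CASE: $\alpha' \models \neg\mathit{WF}(C_{G,s/r3})$
  (2)1. $\alpha' \models \Diamond\Box(\mathit{mode}_r = \mathtt{rec} \lor (\mathit{mode}_r = \mathtt{rcvd} \land \mathit{buf}_r \neq \varepsilon) \lor \mathit{mode}_r = \mathtt{ack}) \land \Diamond\Box\neg\langle C_{G,s/r3}\rangle$
    PROOF: By Assumption (1) and the definitions of $\mathit{WF}$ and $\mathit{enabled}(C_{G,s/r3})$.
  (2)2. $\alpha \models \Diamond\Box(\mathit{mode}_r = \mathtt{rec} \lor (\mathit{mode}_r = \mathtt{rcvd} \land \mathit{buf}_r \neq \varepsilon) \lor \mathit{mode}_r = \mathtt{ack}) \land \Diamond\Box\neg\langle C_{G,s/r3}\rangle$
    PROOF: From (2)1 by definition of $R_{CG}$, the fact that $C_{G,s/r3}$ contains external
    actions only, and Lemmas 5.25 and 5.26.
  (2)3. $\alpha \models \neg\mathit{WF}(C_{C,r1})$
    PROOF: By (2)2 using the definition of $\mathit{WF}$, the fact that $C_{C,r1} = C_{G,s/r3}$, and the
    definition of $\mathit{enabled}(C_{C,r1})$.
  (2)4. Q.E.D.
    PROOF: (2)3 contradicts the assumption that $\alpha \models Q_C$.

(1)5. CASE: $\alpha' \models \neg\mathit{WF}(C_{G,s/r4})$
  (2)1. Q.E.D.
    PROOF: Similar to Case (1)4 we get $\alpha \models \neg\mathit{WF}(C_{C,r2})$, which contradicts the
    assumption that $\alpha \models Q_C$.

(1)6. CASE: $\alpha' \models \neg\forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{sr}(p)\rangle)$
  (2)1. $\alpha' \models \exists p : (\Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle \land \Diamond\Box\neg\langle\mathit{receive\_pkt}_{sr}(p)\rangle)$
    PROOF: Directly from Assumption (1).
  (2)2. $\alpha \models \exists p : (\Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle \land \Diamond\Box\neg\langle\mathit{receive\_pkt}_{sr}(p)\rangle)$
    PROOF: By (2)1, Lemma 3.5 Parts 7 and 8, and Lemma 5.25.
  (2)3. $\alpha \models \neg\forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{sr}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{sr}(p)\rangle)$
    PROOF: Directly from (2)2.
  (2)4. Q.E.D.
    PROOF: (2)3 contradicts Lemma 10.18 Part 1.

(1)7. CASE: $\alpha' \models \neg\forall p : \mathit{WF}(\mathit{receive\_pkt}_{sr}(p))$
  (2)1. $\alpha' \models \exists p : \neg\mathit{WF}(\mathit{receive\_pkt}_{sr}(p))$
    PROOF: Directly from Assumption (1).
  (2)2. $\alpha' \models \exists p : \Diamond\Box(p \in sr) \land \Diamond\Box\neg\langle\mathit{receive\_pkt}_{sr}(p)\rangle$
    PROOF: By (2)1 and the definition of $\mathit{WF}$.
  (2)3. $\alpha \models \exists p : \Diamond\Box(p \in \mathit{packets}(sr)) \land \Diamond\Box\neg\langle\mathit{receive\_pkt}_{sr}(p)\rangle$
    PROOF: By (2)2, Lemma 3.5 Parts 7 and 8, the definition of $R_{CG}$, and Lemmas 5.25
    and 5.26.
  (2)4. $\alpha \models \neg\forall p : \mathit{WF}(\mathit{receive\_pkt}_{sr}(p))$
    PROOF: Directly from (2)3 and the definition of $\mathit{WF}$.
  (2)5. Q.E.D.
    PROOF: (2)4 contradicts Lemma 10.18 Part 2.

(1)8. CASE: $\alpha' \models \neg\forall p : (\Box\Diamond\langle\mathit{send\_pkt}_{rs}(p)\rangle \Rightarrow \Box\Diamond\langle\mathit{receive\_pkt}_{rs}(p)\rangle)$
  PROOF: Similar to (1)6 using Lemma 10.19 Part 1.

(1)9. CASE: $\alpha' \models \neg\forall p : \mathit{WF}(\mathit{receive\_pkt}_{rs}(p))$
  PROOF: Similar to (1)7 using Lemma 10.19 Part 2.

(1)10. Q.E.D.
  PROOF: By (1)1 and the exhaustive cases (1)2-(1)9.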


With this result, the timed refinement mapping result of the previous section, and Lemma 5.24 
we can prove that cr correctly implements G?’. 


Lemma 10.21 
Ch Ch, Ge’ 


Proof 
Immediate by Lemmas 10.15, 10.20, and 5.24. 
a 


This lemma allows us to prove that C correctly implements patient(G).


Theorem 10.22 
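$C \sqsubseteq_{Lt} \mathit{patient}(G)$


Proof


By Lemma 10.21 and Lemma 5.30 we get

    $C' \sqsubseteq_{Lt} \mathit{patient}(G')$

which by substitutivity (Lemma 2.33) implies

    $C' \setminus \mathcal{A}_C \sqsubseteq_{Lt} \mathit{patient}(G') \setminus \mathcal{A}_C$

which, by definition of $\mathcal{A}_G$ and $\mathcal{A}_C$, gives

    $C' \setminus \mathcal{A}_C \sqsubseteq_{Lt} \mathit{patient}(G') \setminus \mathcal{A}_G$

By Proposition 2.38 we then get

    $C' \setminus \mathcal{A}_C \sqsubseteq_{Lt} \mathit{patient}(G' \setminus \mathcal{A}_G)$

which finally, by definition of $C$ and $G$, gives the result

    $C \sqsubseteq_{Lt} \mathit{patient}(G)$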


Finally, we can state and prove the main result, namely that C correctly implements patient(S). 


Theorem 10.23 
$C \sqsubseteq_{Lt} \mathit{patient}(S)$

Proof


By Theorems 7.18 and 8.19 and the fact that $\sqsubseteq_{L}$ is transitive, we have $G \sqsubseteq_{L} S$. Then the
Embedding Theorem (Theorem 2.37) implies $\mathit{patient}(G) \sqsubseteq_{Lt} \mathit{patient}(S)$. This, Theorem 10.22,
and the fact that $\sqsubseteq_{Lt}$ is transitive finally give the result.


10.6 A “Weak” Clock-Based Protocol 


In the previous section we considered the Clock-Based Protocol C and showed that it
correctly implements the patient version of the specification S. The specification of C makes
some timing assumptions. Specifically, we have assumed a certain channel retry number
k and a maximum channel delay d. Now, what if these assumptions are somehow violated in a
physical implementation of the C protocol? What if a communication wire is damaged during
some construction work and rerouting leads to a transmission delay greater than d for some
packet p? Could the C protocol then suddenly reorder or duplicate messages? The answer is
“no”. In [LSW91], C is designed to guarantee ordered at-most-once delivery even if all the timing
assumptions are violated. However, in case of a timing violation the system might lose messages
even if no crashes occur, but message loss is generally considered less damaging than duplication.

We suspect that this scenario is general for timing-based communication protocols: without 
timing assumptions the protocols satisfy some minimal requirements (like at-most-once message 
delivery), and with timing assumptions the protocols satisfy additional properties (like exactly- 
once message delivery in the absence of crashes). 

Our proofs above do not show that C guarantees at-most-once delivery even if the timing
assumptions are violated. A formal proof of this property would show that a “weak” version of
C with no timing assumptions safely implements a “weak” version of S that allows messages to
be lost at any time. Note that the reason we only need to prove safe implementation, as
opposed to correct implementation, is that “at-most-once message delivery” is a safety property.

In order not to have to redo many of the proofs above when performing the proof between 
the weak versions of the protocols, we think that the proofs should be structured as follows: 
first prove that the weak version of C safely implements the weak version of S$. Then add the 
additional assumptions, prove additional invariants, and extend the first proof to prove correct 
implementation. 

In a temporal logic setting, like TLA [Lam91], “additional assumptions” are added as new 
conjuncts to the specifications. Proof of safe implementation, which is expressed as implication 
in the logic, should then use the new conjuncts of the specification to prove the new conjuncts 
of the implementation. Exactly how this should be performed in our setting is left for future 
research. 


10.7 The Clock-Based Protocol With One Receiver and Multiple Senders


Consider the situation depicted in Figure 10.2. The picture shows a situation where several
receivers, each interacting with a single sender, are placed on the same node. Thus, n copies
of the sender, receiver, and channels from above are put in parallel. Instead of implementing n
identical copies of the receiver on the receiver node, a single optimized process can be designed
that implements the parallel composition of the receivers. Then, due to the substitutivity results
for live timed I/O automata (Proposition 2.33), such a multiple-sender receiver (called
the ms-receiver) will work in concert with the n senders. Below, we let ss-receiver denote the
single-sender receiver from above.


[Figure 10.2: The Clock-Based Protocol with several receivers on the same node. Sender_1, ..., Sender_n are each paired with Receiver_1, ..., Receiver_n.]

In [LSW91], the receiver of the Clock-Based Protocol is in fact designed to handle multiple 
senders. This receiver has a structure very similar to the ss-receiver. However, it is optimized so 
that only one single upper, variable is needed. This is important since upper, variables must be 
kept stable and stable updates are expensive. Furthermore, “old” lower, variables, i.e., lower, 
variables for senders that have not sent messages for a long time, can be cleaned up such that 
sufficient information about these old variables can be kept in a single common lower, variable. 

This section describes the design of the ms-receiver of [LSW91] and sketches the proof that it
implements the parallel composition of n ss-receivers. It turns out that because of the similarities
between the ms-receiver and the ss-receiver, the proof is very simple.


Figure 10.3 shows the visible actions of the ms-receiver. There are n versions of the channel
actions, receive message actions, and recovery actions but only one each of $\mathit{crash}_r$ and $\mathit{tick}_r$.
This user interface is then the same as one would get by composing n copies of the ss-receiver in 
parallel after indexing all locally-controlled actions with the index of the ss-receiver. It may seem 
strange to have a recovery action for each index; however, since the ms-receiver should implement 
and, thus, have the same user interface as the parallel composition of n (renamed) ss-receivers, 
and since live timed I/O automata cannot synchronize on output actions (like recovery), it is 
inevitable that the ms-receiver has n recovery actions. One should, thus, think of the ms-receiver 
as offering recovery of its n parts, one by one. 

Let $C_{ms,r}$ be a live timed I/O automaton modeling the ms-receiver. It should, then, be
proved that


    $C_{ms,r} \sqsubseteq_{Lt} C_{r,1} \,\|\, \cdots \,\|\, C_{r,n}$


where $C_{r,i} = \rho_i(C_r)$ and the function $\rho_i$ maps each locally-controlled action of $C_r$ to an indexed
version of the same action, and is the identity mapping for the remaining actions. For instance,
$\rho_i$ maps $\mathit{receive\_pkt}_{sr}(p)$ to $\mathit{receive\_pkt}_{sr,i}(p)$. (Actually, the processes $C_{r,1}, \ldots, C_{r,n}$ are not
compatible in the strong sense where the ordinary state variable names of different processes
are required to be non-overlapping. So, for present purposes, assume that all state variables of
$C_{r,i}$ (except now) are indexed with $i$.)


[Figure 10.3: The visible actions of the ms-receiver: receive_pkt_{sr,1}(p), ..., receive_pkt_{sr,n}(p); send_pkt_{rs,1}(p), ..., send_pkt_{rs,n}(p); receive_msg_1(m), ..., receive_msg_n(m); recover_{r,1}, ..., recover_{r,n}; crash_r; tick_r.]


We do not define C_{ms,r} completely formally but sketch how it works. First, recall that in C_r,
lower_r indicates a lower bound on timestamps that the receiver will accept. Every time a new
message is accepted, lower_r is advanced to the timestamp of that message. Furthermore, special
increase-lower_r steps are allowed in C_r to increase lower_r as long as it is kept small enough to
allow very slow messages from the sender to be accepted.

C_{ms,r} contains n versions (lower_{r,1}, ..., lower_{r,n}) of lower_r, one for each sender, and each
variable lower_{r,i} remembers the last timestamp received from the ith sender in order to ensure
that only messages with later timestamps will be accepted from that sender in the future.
In C_{ms,r}, lower_{r,i} is only advanced when packets are accepted from the ith sender, i.e., in
receive_pkt_{sr,i}(p) steps.

Now, C_{ms,r} furthermore contains a common-lower_r variable. This variable is increased in
special increase-common-lower_r steps, and whenever it advances past the value of a lower_{r,i}
variable, this lower_{r,i} variable is changed to nil, i.e., is cleaned up. Thus, common-lower_r
captures all relevant information about the timestamps that must be accepted from senders
that have not sent for a while, as long as common-lower_r is kept sufficiently small.

Also, C_{ms,r} only needs a single upper_r variable, which gives the upper bound on timestamps
that can be accepted from any sender.

Figure 10.4 shows how an increase-common-lower_r step changes a lower_{r,i} variable to nil. In
situation a), C_{ms,r} will accept timestamps in the interval (common-lower_r, upper_r] from sender
2 and timestamps in the interval (lower_{r,i}, upper_r] from sender i ∈ {1, 3}. In situation b),
lower_{r,1} has been cleaned up and C_{ms,r} will consequently now only accept timestamps in the
interval (common-lower_r, upper_r] from sender 1. However, this is safe since common-lower_r is
kept sufficiently small (in the same way the lower_r variable is kept sufficiently small in C_r).


[Figure 10.4: Two timelines running from common-lower_r to upper_r. In situation a), lower_{r,2} = nil
while lower_{r,1} and lower_{r,3} are set. In situation b), lower_{r,2} = nil and lower_{r,1} = nil while
lower_{r,3} is set. The difference between situation a) and b) is that an increase-common-lower_r
step of C_{ms,r} has advanced common-lower_r and thereby has cleaned up lower_{r,1} (by changing it to nil).]


All other variables of C_r, except time_r, have n versions in C_{ms,r}. For instance, C_{ms,r} has the n
buffers buf_{r,1}, ..., buf_{r,n}. However, of course, only one local receiver clock time_r is needed.

We only specify the most interesting steps of C_{ms,r}. These are the steps labeled with
receive_pkt_{sr,i}(m, t) or increase-common-lower_r(t) actions.


receive_pkt_{sr,i}(m, t)
    Effect:
        if mode_{r,i} ≠ rec then
            if (lower_{r,i} ≠ nil ∧ lower_{r,i} < t ≤ upper_r) ∨
               (lower_{r,i} = nil ∧ common-lower_r < t ≤ upper_r) then
                mode_{r,i} := rcvd
                buf_{r,i} := buf_{r,i} ^ m
                last_{r,i} := t
                rm-time_{r,i} := ∞
                lower_{r,i} := t
            else if (lower_{r,i} ≠ nil ∧ last_{r,i} < t ≤ lower_{r,i}) ∨
                    (lower_{r,i} = nil ∧ last_{r,i} < t ≤ common-lower_r) then
                nack-buf_{r,i} := nack-buf_{r,i} ^ t
            else if mode_{r,i} = idle ∧ last_{r,i} = t then
                mode_{r,i} := ack


increase-common-lower_r(t)
    Precondition:
        ∀i: (mode_{r,i} ≠ rec) ∧
        common-lower_r < t < time_r − ρ
    Effect:
        common-lower_r := t
        for all i with lower_{r,i} ≠ nil:
            if common-lower_r > lower_{r,i} then
                lower_{r,i} := nil
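
The two steps above can be sketched in executable form. The following is only a minimal illustration, not the formal automaton: the class name `MsReceiverSketch`, the 0-based sender indices, the attribute names, the name `rho` for the timing constant, and all initial values are assumptions made for the example, and `None` stands in for nil.

```python
import math

NIL = None  # models the value nil of a cleaned-up lower_{r,i}


class MsReceiverSketch:
    """Illustrative state of the ms-receiver for n senders (indices 0..n-1)."""

    def __init__(self, n, rho, time, upper):
        self.n = n
        self.rho = rho                          # timing constant of the precondition
        self.time = time                        # time_r, the single receiver clock
        self.upper = upper                      # upper_r, shared by all senders
        self.common_lower = 0.0                 # common-lower_r (initial value assumed)
        self.lower = [NIL] * n                  # lower_{r,i}
        self.last = [-math.inf] * n             # last_{r,i} (initial value assumed)
        self.mode = ["idle"] * n                # mode_{r,i}
        self.buf = [[] for _ in range(n)]       # buf_{r,i}
        self.nack_buf = [[] for _ in range(n)]  # nack-buf_{r,i}
        self.rm_time = [0.0] * n                # rm-time_{r,i} (initial value assumed)

    def bound(self, i):
        """Lower bound in force for sender i: lower_{r,i}, or common-lower_r if nil."""
        return self.common_lower if self.lower[i] is NIL else self.lower[i]

    def receive_pkt(self, i, m, t):
        """Effect of a receive_pkt step for sender i with message m and timestamp t."""
        if self.mode[i] == "rec":
            return
        b = self.bound(i)
        if b < t <= self.upper:
            # Timestamp is new enough: accept the message.
            self.mode[i] = "rcvd"
            self.buf[i].append(m)
            self.last[i] = t
            self.rm_time[i] = math.inf
            self.lower[i] = t
        elif self.last[i] < t <= b:
            # Timestamp is too old to accept: record a negative acknowledgement.
            self.nack_buf[i].append(t)
        elif self.mode[i] == "idle" and self.last[i] == t:
            self.mode[i] = "ack"

    def increase_common_lower(self, t):
        """Advance common-lower_r to t and clean up every lower_{r,i} it passes."""
        enabled = (all(md != "rec" for md in self.mode)
                   and self.common_lower < t < self.time - self.rho)
        if not enabled:
            raise ValueError("precondition of increase-common-lower violated")
        self.common_lower = t
        for i in range(self.n):
            if self.lower[i] is not NIL and self.common_lower > self.lower[i]:
                self.lower[i] = NIL
```

The helper `bound` merges the two disjuncts of each guard: when lower_{r,i} is nil the common variable takes its place, which is exactly the case split written out in the step definition.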


Note that the timing constant ρ, which occurs in the definition of increase-common-lower_r
steps, is the same constant as for the ss-receiver above.

Steps labeled by crash_r should in C_{ms,r} change all mode_{r,i} variables to rec.


It requires a timed refinement mapping to verify that C_{ms,r} correctly implements C_{r,1} ‖ ··· ‖ C_{r,n}.
This refinement mapping R_{ms} maps most variables one-to-one. Let s be any state of C_{ms,r}.
Then R_{ms}(s) is the state u that for all i satisfies


• u.upper_{r,i} = s.upper_r.

• u.time_{r,i} = s.time_r.

• u.lower_{r,i} = (if s.lower_{r,i} ≠ nil then s.lower_{r,i} else s.common-lower_r).

• u.x = s.x for the remaining variables x.


It is fairly straightforward to verify that R_{ms} actually is a timed refinement mapping. The way
lower_{r,i} is defined in the mapping implies that a receive_pkt_{sr,i}(m, t) step of C_{ms,r} directly
corresponds to a receive_pkt_{sr,i}(m, t) step of C_{r,1} ‖ ··· ‖ C_{r,n}. In fact, there is the same one-to-one
correspondence for all other actions, except for increase-common-lower_r(t) and increase-upper_r(t).

An increase-common-lower_r(t) step of C_{ms,r} may change several lower_{r,i} variables to nil.
This corresponds at the abstract level to these lower_{r,i} variables being advanced. Thus, an
increase-common-lower_r(t) step of C_{ms,r} corresponds to a series of increase-lower_{r,i}(t) steps, one for
each process identifier i for which lower_{r,i} = nil in C_{ms,r} after the increase-common-lower_r(t)
step.

An increase-upper_r(t) step simply corresponds to a sequence of steps labeled increase-upper_{r,1}(t),
..., increase-upper_{r,n}(t).


We do not complete the modeling of C_{ms,r} in this report but leave this and the complete
simulation and liveness proofs for future work.
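
The refinement mapping can be illustrated with a small sketch. It is not part of the formal development: the dictionary layout, the key names, and the use of `None` for nil are assumptions made for the example, and only two of the "remaining variables" (mode and buf) are carried along.

```python
def refinement_map(s, n):
    """Map a concrete ms-receiver state s to n abstract ss-receiver states.

    s is a dict with shared entries "upper", "time", "common_lower" and
    per-sender lists "lower", "mode", "buf" (hypothetical layout).
    """
    abstract = []
    for i in range(n):
        lower_i = s["lower"][i]
        abstract.append({
            "upper": s["upper"],     # u.upper_{r,i} = s.upper_r
            "time": s["time"],       # u.time_{r,i} = s.time_r
            # A cleaned-up (nil) lower_{r,i} is read as common-lower_r.
            "lower": lower_i if lower_i is not None else s["common_lower"],
            "mode": s["mode"][i],    # remaining variables map one-to-one
            "buf": list(s["buf"][i]),
        })
    return abstract


before = {"upper": 12.0, "time": 10.0, "common_lower": 3.0,
          "lower": [None, 7.0], "mode": ["idle", "idle"], "buf": [[], ["m"]]}
# The same state after a concrete increase-common-lower step with t = 8.0:
after = dict(before, common_lower=8.0, lower=[None, None])

lowers_before = [u["lower"] for u in refinement_map(before, 2)]
lowers_after = [u["lower"] for u in refinement_map(after, 2)]
```

In this example the abstract lower bounds move from 3.0 and 7.0 to 8.0 for both senders, so the single concrete step is matched by one abstract increase-lower step per sender whose variable is nil afterwards.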


Chapter 11 


Conclusion 


11.1 Summary 


This report contains two parts. Part I describes the formal models of [GSSL93] for timed and 
untimed systems, and the associated simulation-based proof techniques. Also, an extended tem- 
poral logic is developed, in which temporal formulas evaluate over executions of alternating states 
and actions and, thus, are well-suited for describing and reasoning about liveness conditions—in 
the timed setting via sampling characterizations of timed executions. It is furthermore shown 
how application of the semantic operators of parallel composition, action hiding, and action 
renaming is reflected in the syntax. 

The proof techniques are used to prove that one system correctly implements a more abstract 
system. A proof generally consists of three parts. First, several invariants of the systems are 
proved. Then, secondly, a relation is defined and proved to be a simulation relation from the 
concrete to the abstract system. During this process, one generally has to go back and prove 
additional invariants. Finally, a liveness proof builds on top of the simulation result. 

Part II presents a case study intended to check the adequacy of the formal framework on 
large examples. In particular, two practical protocols for solving the at-most-once message 
delivery problem on channels that may delete, duplicate, and reorder packets are considered. 
One protocol is the Five-Packet Handshake Protocol of [Bel76], which is the standard protocol for 
setting up network connections, used in TCP, ISO TP-4, and many other transport protocols. 
The other protocol is the Clock-Based Protocol of [LSW91], which relies on certain timing 
assumptions. Both protocols are sufficiently complicated that it seems that formal proof is the 
only means by which their correctness can be verified. 

Both the specification S of the at-most-once message delivery problem and the Five-Packet
Handshake Protocol, which we call H, are formalized as live I/O automata, however at very
different levels of abstraction. The specification S corresponds closely to the informal description
of the at-most-once message delivery problem, and is easily checked to have the desirable
behavior. H is expressed as the parallel composition of several components.

The Clock-Based Protocol, which we call C, is formalized as a live timed I/O automaton. A
special MMT-specification style is used to specify the sender and receiver in a clear way since
the timing restrictions on these components are of the simple form: if a set of actions becomes
enabled (or stays enabled after being executed), then an action from the set must be executed
after some lower time bound and before some upper time bound, unless the set is disabled in the
meantime. C is formalized in the timed model and S in the untimed model. It is argued that
in this case correctness of C should be expressed with respect to the patient version of S, i.e.,
the object of the timed model that behaves just like S, except that it allows arbitrary passage
of time.


Instead of proving directly that H and C correctly implement S and patient(S), respectively, the
correctness proof is split into smaller parts by introducing intermediate levels of abstraction. In
particular, both H and C can be seen as implementations of an (untimed) Generic Protocol G.
By introducing intermediate levels of abstraction, not only do we get the advantage of splitting
complicated proofs into smaller parts, we also avoid having to repeat proofs of similar parts
in the correctness proofs for both H and C; instead these similar parts are captured
in G and in the proof that G correctly implements S. In fact, we believe that G is sufficiently
general so that other practical protocols can be proved to be correct implementations of G.

A direct proof that G correctly implements S is still very complicated since it involves a
backward simulation, and backward simulations seem to be inherently difficult. Thus, to limit
the backward simulation to a development step as small as possible, the Delayed-Decision
Specification D was defined. In this way the correctness proof for D requires a backward simulation,
whereas the correctness proofs for lower levels of abstraction only require the use of the simpler
(timed) refinements (plus the use of history variables).

The report contains full proofs of correctness for the protocols. However, some of the proofs
are only sketched when similar formal proofs are found elsewhere in the report.


11.2 Evaluation 


The operational models of live (timed) I/O automata, the syntax for describing these, and the 
proof techniques have proved to provide a powerful formal framework within which both untimed 
and timed distributed systems can be formalized and proved correct. The abstract specification 
is close to the informal problem statement and the formalism offers a clear, intuitive, and modular 
approach to the description of the low-level protocols. In particular, for timed systems, where
the only timing restrictions are lower and upper time bounds on progress, the MMT-style offers
a clear notation.

It should be noted, however, that the example presented in this report only proves correctness
of a timed protocol with respect to the patient version of an (untimed) specification. This means
that the timing assumptions of the timed protocol are only used to prove certain invariants,
whereas the handling of time in the simulation proofs is almost trivial. [LA91] deals with timed
simulation proofs (with non-patient specifications) for MMT-style systems.


Some aspects of performing the correctness proofs are intellectually challenging. In particular, 
defining simulation relations involves a lot of insight and intuition about the systems, and also 
finding the sequence of abstract steps that corresponds to a given concrete step requires key 
intuition. In fact these two aspects of the proofs provide important documentation of the 
functionality of systems and can be used to convey intuition about these. 

However, in a simulation proof one must prove that the sequence of abstract steps has the 
right properties. This involves checking that the steps are in fact steps of the abstract system, 
which, in turn, amounts to checking that each variable is handled according to the abstract 
transition relation. This part of the proof involves a lot of tedious details, and forms a quite 
sizable part of the total proof. Because of the details, the proof is very difficult to maintain; 
sometimes, during a proof attempt, one has to go back and change either the abstract or the 
concrete specification, which may lead to a need to change part of the proof already done. 
Unless extreme care is taken, such changes are likely to introduce inconsistencies in the proof.
Apart from this, simulation proof techniques scale well to large examples and impose a nice case
structure on the proof.

Liveness proofs are also challenging. They, too, require insight into the way the protocols
work. The temporal logic offers an expressive way to formalize liveness conditions and an ad hoc set
of rules. Our liveness proofs are not proofs of validity of temporal formulas, but instead proofs
of satisfaction, i.e., that certain executions satisfy the temporal formulas. In the proof steps
temporal rules, which have the form of valid implications, meta rules, and semantic reasoning
are used. This seems to provide a straightforward way of performing careful liveness proofs by
hand.


Live (timed) I/O automata, temporal logic, and simulation-based proof techniques are good 
tools for formally specifying and verifying timed and untimed communication protocols. 

The embedding results of the model tie the untimed and timed models together in a very 
general and useful coordinated framework that allows proving that a timed system correctly 
implements an untimed specification. 


11.3 Further Work


There is a considerable amount of further work remaining. We have already begun the work of
automating simulation proofs in the untimed model, by proving the equivalence of versions of
S and D using the Larch Prover [SGG+93, GG91]. We have been pleased with the preliminary
results: the prover has not only been able to check our hand proofs, but in fact has been able to
fill in many of the details. Current research tries to use the same approach on a timed forward
simulation. Future research should consider automation of more complicated simulation proofs.

Second, if the timing assumptions on C are weakened or removed, the resulting algorithm 
still will not deliver any message more than once; however, it may lose messages even in the 
absence of a crash. It remains to formulate the weaker specification and prove that the weaker 
version of C satisfies it. 

Third, there are other algorithms that solve the at-most-once message delivery problem, for 
example, using bounded identifier spaces or cryptographic assumptions. We would like also to 
verify these, preferably reusing as much of our proofs as possible. 

Finally, future research should deal with the extended temporal logic developed in this work, 
and try to find a basic set of rules that is adequate for the liveness proofs of typical distributed 
systems. The rules presented in this report, which are specifically tailored for the case study, 
seem to be a good starting point for such an investigation. 


11.4 Conclusions 
We can draw several conclusions:


• Live (timed) I/O automata, temporal logic, and simulation-based proof techniques provide
a powerful coordinated framework for formally specifying and verifying timed and untimed
communication protocols.


• The proof techniques, especially simulation proofs, scale well and are not too difficult
to use. It is challenging and requires insight and key intuition to find, e.g., the right
simulation relations, and a lot of detailed work to verify these choices. For large proofs,
computer assistance is essential to help with the details; however, the insight will always
be required.


• Backward simulation proofs are much harder to do than refinement mapping and forward
simulation proofs but are necessary in certain situations. It seems to be worthwhile to try
to limit the use of backward simulations to as small a development step as possible.


• Many practical protocols can be treated as implementations of a common abstract protocol.


• Verifying a coordinated collection of protocols, rather than just a single isolated protocol,
is extremely valuable. It leads to the discovery of useful abstractions, and tends to make
the proofs more elegant.


• Doing proofs for realistic communication protocols is feasible now. We predict that it will
become more so, and will be of considerable practical importance.


Appendix A


Basic Definitions 


This appendix gives basic definitions used in this report. 


A.1 Record Notation 


If a variable or value is of tuple type, e.g., X × Y × Z, we will use the normal record notation to
extract the sub-values. For example, if d has type X × Y × Z, d.x will extract the first component
of the tuple, etc.


A.2 Sets 


We use standard notation for sets. A set consisting of the elements e_1, e_2, ... we write as

    {e_1, e_2, ...}

and a notation like

    {f(i) | i ∈ N ∧ g(i) = 4}

is used to denote the set of all elements f(i), where i is a natural number such that g(i) = 4.
A singleton set with the element e is sometimes written e instead of {e}. As usual we use
∈ to express set membership, and ⊂ and ⊆ to express the proper subset and subset relations,
respectively. The empty set is denoted by ∅. Furthermore we use the normal operators on sets

    ∪   Union
    ∩   Intersection
        Complement (with respect to some given set)
    \   Set minus


Set Type
For any set S, denote by P(S) the set of all (finite or infinite) subsets of S.


Cardinality
The cardinality of a set S, written |S|, is defined as

    |S| ≜ n   if S has n elements
          ∞   if S has infinitely many elements


A.3 Bags (Multisets)


For bags we use the following operators from the previous section:

    |s|, ∅, ∪, ∈

|s| counts the total number of elements (including duplicates) of s.


Bag Type

For any set S, denote by B(S) the set of all (finite or infinite) bags with elements from S.


A.4 Lists and Sequences 


In this report we use the terms “sequence”, “list”, and “queue” synonymously.


A list l consisting of the elements e_0, e_1, ... we will write in one of the ways

    l = ⟨e_0, e_1, ...⟩
    l = e_0, e_1, ...
    l = e_0 e_1 ...


We denote by ε the empty list.


List Type


For any set S, denote by S* the set of all finite lists of elements in S.


Length


The length of a list l = ⟨e_0, e_1, ...⟩, written |l|, is defined as

    |l| ≜ n   if l is finite and ends in e_{n−1}
          ∞   if l is infinite


Head, Tail, Last, and Init
If l = ⟨e_0, e_1, e_2, ...⟩ is nonempty, define

    head(l) ≜ e_0
    tail(l) ≜ ⟨e_1, e_2, ...⟩


If furthermore l is finite and ends in e_{n−1}, then define

    last(l) ≜ e_{n−1}
    init(l) ≜ ⟨e_0, e_1, ..., e_{n−2}⟩
Concatenation


Concatenation of two lists l_1 and l_2, written l_1 ^ l_2 or sometimes l_1 l_2, is defined when l_1 is finite.
If l_1 = ⟨e_0, ..., e_{n−1}⟩ and l_2 = ⟨e_n, e_{n+1}, ...⟩, then define

    l_1 ^ l_2 ≜ ⟨e_0, ..., e_{n−1}, e_n, e_{n+1}, ...⟩


List Construction
Let I = {i_1, i_2, ...} be a set of totally ordered elements with i_1 < i_2 < ···. Then define

    ⟨f(i) | i ∈ I ∧ P(i)⟩ ≜ l_{i_1} ^ l_{i_2} ^ ···

where f is a function, P is a predicate, and

    l_i = ⟨f(i)⟩   if P(i)
          ε        otherwise
Indexing
If l = ⟨e_0, e_1, ...⟩, then define for all i with 0 ≤ i < |l|

    l[i] = e_i

We let dom(l) denote the set of indices of any list l. Thus,

    dom(l) = {i | 0 ≤ i < |l|}

We also let elems(l) be the set of elements in l. Thus,

    elems(l) = {l[i] | i ∈ dom(l)}

If l is nonempty, we denote by maxidx(l) the maximum index in l. Thus,

    maxidx(l) = |l| − 1


Restriction


If l is a list and S is a set, we let l ↾ S denote the restriction of l to S. For example,
⟨1, 3, 2, 5, 4⟩ ↾ {2, 3, 4, 7} = ⟨3, 2, 4⟩. Formally,

    l ↾ S ≜ ⟨l[i] | i ∈ dom(l) ∧ l[i] ∈ S⟩


Set Operations on Lists


As notational convention we allow set operators like ∈, ⊆, etc., to operate on lists l. This should
just be thought of as a shorthand notation for the same operators operating on elems(l). For
instance, e ∈ l means e ∈ elems(l) and l ⊆ S means elems(l) ⊆ S for some set S.
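
The list operators of this section can be mirrored on finite Python lists. This is only an illustrative sketch: the function names follow the text, but `construct` (for the list-construction notation) is a name chosen here, and infinite lists are outside the scope of the example.

```python
def head(l):
    """First element of a nonempty list."""
    return l[0]


def tail(l):
    """All elements of a nonempty list except the first."""
    return l[1:]


def last(l):
    """Last element of a nonempty finite list."""
    return l[-1]


def init(l):
    """All elements of a nonempty finite list except the last."""
    return l[:-1]


def dom(l):
    """Set of indices of l."""
    return set(range(len(l)))


def elems(l):
    """Set of elements occurring in l."""
    return set(l)


def maxidx(l):
    """Maximum index of a nonempty list."""
    return len(l) - 1


def construct(indices, f, p):
    """List construction: f(i) for every index i satisfying p, in increasing order."""
    return [f(i) for i in sorted(indices) if p(i)]


def restrict(l, s):
    """Restriction of l to the set s, defined via list construction as in the text."""
    return construct(dom(l), lambda i: l[i], lambda i: l[i] in s)


example = restrict([1, 3, 2, 5, 4], {2, 3, 4, 7})
```

With the list and set from the restriction example in the text, `example` is `[3, 2, 4]`.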


A.5 Functions and Mappings 


We use the terms “function” and “mapping” synonymously. We use standard notation for
function definition and application. When explicitly defining the mapping from elements to
elements we use notation like

    [ 1 ↦ 1,
      2 ↦ 4,
      3 ↦ 9,
      ...
      9 ↦ 81 ]

or equivalently

    [ i ↦ i² | 1 ≤ i ≤ 9 ]


Function Type
A function f mapping elements from S_1 to S_2 has the type

    S_1 → S_2

We shall only deal with total functions, i.e., f(s) is defined for all elements s ∈ S_1. S_1 is referred
to as the domain of f and S_2 as the codomain of f.


Domain and Range


For any function f, dom(f) denotes the domain of f. The range (or image) of f is defined as

    rng(f) = {f(e) | e ∈ dom(f)}


Operations on Functions 


For functions g : A → B and f : C → D with B ⊆ C, define the composition f ∘ g : A → D such
that for all a ∈ A,

    (f ∘ g)(a) = f(g(a))


For any function f : A → B and set S, denote by f \ S the function with type (A \ S) → B such
that for all a ∈ A \ S,

    (f \ S)(a) = f(a)

Similarly f ↾ S denotes the function of type (A ∩ S) → B such that for all a ∈ A ∩ S

    (f ↾ S)(a) = f(a)


For functions f_i : A_i → B_i, 1 ≤ i ≤ k, with disjoint domains, denote by f_1 ∪ ··· ∪ f_k the function
of type (A_1 ∪ ··· ∪ A_k) → (B_1 ∪ ··· ∪ B_k) such that for all a ∈ (A_1 ∪ ··· ∪ A_k)

    (f_1 ∪ ··· ∪ f_k)(a) = f_i(a)   if a ∈ A_i
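
For finite domains, the operations on functions can be tried out with Python dictionaries standing in for total functions. This is a sketch under that assumption; the names `compose`, `minus`, `restrict`, and `union` are chosen here for illustration, and `compose` follows the displayed equation (f ∘ g)(a) = f(g(a)).

```python
def rng(f):
    """Range (image) of f."""
    return set(f.values())


def compose(f, g):
    """(f o g)(a) = f(g(a)); every value of g must lie in the domain of f."""
    return {a: f[g[a]] for a in g}


def minus(f, s):
    """f \\ S: f with the elements of S removed from its domain."""
    return {a: v for a, v in f.items() if a not in s}


def restrict(f, s):
    """f restricted to S: f with its domain intersected with S."""
    return {a: v for a, v in f.items() if a in s}


def union(*fs):
    """Union of functions with pairwise disjoint domains."""
    result = {}
    for f in fs:
        if result.keys() & f.keys():
            raise ValueError("domains are not disjoint")
        result.update(f)
    return result


# The explicit mapping [ i -> i^2 | 1 <= i <= 9 ] from the text.
squares = {i: i * i for i in range(1, 10)}
successor = {i: i + 1 for i in range(1, 9)}
shifted_squares = compose(squares, successor)  # maps i to (i + 1)^2
```

A dictionary lookup raises `KeyError` outside the domain, which matches the convention that only total functions on their stated domain are considered.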


Appendix B 


Proofs from Part I 


B.1 Proofs in Chapter 3 


Proof of Lemma 3.1:


Let α be an arbitrary execution over (V, A).
If α is infinite, then α̂ = α and the result trivially follows.


Now, assume α is finite and let α = s_0 a_1 s_1 a_2 s_2 ··· a_n s_n. Furthermore, let j ≥ 0 be an arbitrary
natural number. Let a_i = ζ and s_i = s_n for all i > n. Then α̂ = s_0 a_1 s_1 a_2 s_2 ···. We prove the
lemma by structural induction over P.


Base Case: P is a step formula

        (α, j) ⊨ P
    iff (by definition)
        (0 ≤ j < n and (s_j, a_{j+1}, s_{j+1}) ⊨ P) or
        (j ≥ n and (s_n, ζ, s_n) ⊨ P)
    iff (by definition of s_i and a_i for i > n)
        0 ≤ j and (s_j, a_{j+1}, s_{j+1}) ⊨ P
    iff (by definition)
        (α̂, j) ⊨ P


Inductive Step:


Assume as induction hypothesis that Q is a temporal formula over (V_Q, A_Q) such that for all
α_Q over (V_Q, A_Q) and all j_Q ≥ 0

    (α_Q, j_Q) ⊨ Q   iff   (α̂_Q, j_Q) ⊨ Q


Assume a similar induction hypothesis for R. We consider the different possibilities for P (cf.
Section 3.5).


• P = ○Q

        (α, j) ⊨ ○Q
    iff (by definition)
        (α, j + 1) ⊨ Q
    iff (by the induction hypothesis)
        (α̂, j + 1) ⊨ Q
    iff (by definition)
        (α̂, j) ⊨ ○Q

• P = Q W R

  Similar to case P = ○Q.


• P = ∀x : Q
  Since P is a temporal formula over (V, A), Q is a temporal formula over (V ∪ {x}, A).

        (α, j) ⊨ ∀x : Q
    iff (by definition)
        for all values v, (α^x_v, j) ⊨ Q
    iff (by the induction hypothesis)
        for all values v, (the extension of α^x_v, j) ⊨ Q
    iff (by definition of ^ and α^x_v)
        for all values v, ((α̂)^x_v, j) ⊨ Q
    iff (by definition)
        (α̂, j) ⊨ ∀x : Q


• P = ∃x : Q
  Similar to case P = ∀x : Q.


• P = Q ⇒ R
  Similar to case P = ○Q.


• P = ¬Q
  Similar to case P = ○Q.


Proof of Lemma 3.2: 


This lemma holds for our temporal logic since we do not have any past operators, i.e., operators
that can reference previous positions in an execution. For instance, some temporal logics (see,
e.g., [MP92]) have a previous operator, which is dual to our next operator ○ and is defined
such that previous P holds at position i in an execution if P holds at position i − 1 in that
execution. Since our logic lacks this possibility of referencing previous positions, the question
whether P holds at position j in α only depends on the suffix s_j a_{j+1} s_{j+1} ··· of α, i.e., _j|α.
Similarly, the question whether P holds at position i in _{j−i}|α only depends on _i|(_{j−i}|α), and
since _i|(_{j−i}|α) = _j|α, the result follows.


Formally, the result can be proven by structural induction over P.


Proof of Lemma 3.3:


Let α = s_0 a_1 s_1 a_2 s_2 ··· and α' = s'_0 a'_1 s'_1 a'_2 s'_2 ···. We define inductively a nondecreasing mapping
m : N → N such that _k|α ≃ _{m(k)}|α'. Furthermore, for each k we define a mapping m_k :
{0, ..., m(k) − 1} → {0, ..., k − 1}, such that for all 0 ≤ i' < m(k), _{m_k(i')}|α ≃ _{i'}|α'. This
inductive definition is clearly sufficient to prove the lemma.

Base Case: k = 0

Define m(0) = 0. Then, by assumption, _0|α = α ≃ α' = _{m(0)}|α', as required.

Let m_0 be the empty mapping. Then, vacuously, for all 0 ≤ i' < m(0), _{m_0(i')}|α ≃ _{i'}|α'.


Inductive Step:


Assume as induction hypothesis that _k|α ≃ _{m(k)}|α' and that, for all 0 ≤ i' < m(k), _{m_k(i')}|α ≃ _{i'}|α'.
We consider cases.


• a_{k+1} = ζ.

  Define m(k + 1) = m(k). Then, clearly, _{k+1}|α ≃ _k|α ≃ _{m(k)}|α' = _{m(k+1)}|α'.

  Define m_{k+1} = m_k. By the induction hypothesis and the definition of m(k + 1) and m_{k+1},
  for all 0 ≤ i' < m(k + 1), _{m_{k+1}(i')}|α ≃ _{i'}|α'.


• a_{k+1} = a ≠ ζ.

  Then, since _k|α ≃ _{m(k)}|α' (induction hypothesis), there must be a unique number k' > m(k)
  such that s'_{m(k)} a'_{m(k)+1} s'_{m(k)+1} ··· a'_{k'} s'_{k'} = s'_{m(k)} ζ s'_{m(k)} ··· ζ s'_{m(k)} a s'_{k'}.
  Thus, the first non-stuttering action in α' after position m(k) must be a.

  Define m(k + 1) = k'. Then the induction hypothesis, the definition of k', and the case
  assumption imply _{k+1}|α ≃ _{k'}|α' = _{m(k+1)}|α'.

  Define m_{k+1} to coincide with m_k for all 0 ≤ i' < m(k), and define m_{k+1}(i') = k, for
  all m(k) ≤ i' < m(k + 1). Then the induction hypothesis and the definition of m_{k+1}
  give, for all 0 ≤ i' < m(k), _{m_{k+1}(i')}|α ≃ _{i'}|α'. For m(k) ≤ i' < m(k + 1) we have
  _{m_{k+1}(i')}|α = _k|α ≃ _{m(k)}|α' ≃ _{i'}|α', where the last stuttering-equivalence follows from the
  fact that _{i'}|α' only differs from _{m(k)}|α' by having less stuttering in the start.


This concludes the proof.
□
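
The construction in this proof can be tried out on small finite executions. The sketch below rests on assumptions that are not taken from this appendix: an execution is encoded as an alternating Python list of states and actions, the string `ZETA` stands for the stuttering action, and two finite executions are treated as stuttering-equivalent when they agree after all stuttering steps are removed. The function names are hypothetical.

```python
ZETA = "zeta"  # stand-in for the stuttering action


def destutter(execution):
    """Remove every stuttering step (s, ZETA, s) from [s0, a1, s1, a2, s2, ...]."""
    out = [execution[0]]
    for j in range(1, len(execution), 2):
        action, state = execution[j], execution[j + 1]
        if action == ZETA and state == out[-1]:
            continue  # stuttering step: state unchanged, nothing to record
        out += [action, state]
    return out


def stutter_equiv(e1, e2):
    """Assumed notion of stuttering equivalence for finite executions."""
    return destutter(e1) == destutter(e2)


def suffix(execution, k):
    """The suffix of the execution that starts in state s_k."""
    return execution[2 * k:]


def matching_positions(e1, e2):
    """For each position k of e1, the least k' of e2 with equivalent suffixes."""
    steps1, steps2 = len(e1) // 2, len(e2) // 2
    result = []
    for k in range(steps1 + 1):
        match = next((kp for kp in range(steps2 + 1)
                      if stutter_equiv(suffix(e1, k), suffix(e2, kp))), None)
        result.append(match)
    return result


alpha = [0, "a", 1, ZETA, 1, "b", 2]
alpha_prime = [0, ZETA, 0, "a", 1, "b", 2]
m = matching_positions(alpha, alpha_prime)
```

For these two executions `m` is `[0, 2, 2, 3]`: a stuttering step of `alpha` keeps the matched position unchanged, and a non-stuttering step moves it past the next non-stuttering step of `alpha_prime`, as in the two cases of the inductive step.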


Proof of Proposition 3.4:
Let α_1 = s_{1,0} a_{1,1} s_{1,1} a_{1,2} s_{1,2} ··· and α_2 = s_{2,0} a_{2,1} s_{2,1} a_{2,2} s_{2,2} ··· be arbitrary executions such that
α_1 ≃ α_2.

1. Let P be a state predicate.

        α_1 ⊨ P
    iff (by definition)
        s_{1,0} ⊨ P
    iff (since α_1 ≃ α_2 implies s_{1,0} = s_{2,0})
        s_{2,0} ⊨ P
    iff (by definition)
        α_2 ⊨ P

   This proves that P is stuttering-insensitive.


2. Let P be a state transition predicate, and assume that (s, ζ, s) ⊨ P (which implies
   (s, s)[P] = true) for all states s.

        α_1 ⊨ P
    iff (by definition)
        (s_{1,0}, s_{1,1})[P] = true
    implies (since α_1 ≃ α_2 implies either (s_{1,0}, s_{1,1}) = (s_{2,0}, s_{2,1}) or (s_{1,0}, s_{1,1}) = (s_{2,0}, s_{2,0}))
        (s_{2,0}, s_{2,1})[P] = true or (s_{2,0}, s_{2,0})[P] = true
    iff (since (s_{2,0}, s_{2,0})[P] = true by assumption)
        (s_{2,0}, s_{2,1})[P] = true
    iff (by definition)
        α_2 ⊨ P

   A symmetric argument gives the implication in the other direction. This proves that P is
   stuttering-insensitive.


3. Let f be an action function.

        α_1 ⊨ ◇⟨f⟩
    iff (by definition)
        there is a step (s_{1,i}, a_{1,i+1}, s_{1,i+1}) in α_1 such that a_{1,i+1} ∈ (s_{1,i}, s_{1,i+1})[f]
    iff (since ζ can never be in the range of an action function)
        there is a step (s_{1,i}, a_{1,i+1}, s_{1,i+1}) in α_1 such that a_{1,i+1} ≠ ζ and a_{1,i+1} ∈ (s_{1,i}, s_{1,i+1})[f]
    implies (by definition of ≃)
        there is a step (s_{2,j}, a_{2,j+1}, s_{2,j+1}) = (s_{1,i}, a_{1,i+1}, s_{1,i+1}) in α_2 such that
        a_{2,j+1} ∈ (s_{2,j}, s_{2,j+1})[f]
    iff (by definition)
        α_2 ⊨ ◇⟨f⟩

   A symmetric argument gives the implication in the other direction. This proves that ◇⟨f⟩
   is stuttering-insensitive.


4. Assume that P and Q are stuttering-insensitive temporal formulas.

   (a) P W Q

        α_1 ⊨ P W Q
    iff (by definition)
        there exists a k ≥ 0 such that (α_1, k) ⊨ Q and for every 0 ≤ i < k, (α_1, i) ⊨ P,
        or else, for all i ≥ 0, (α_1, i) ⊨ P
    iff (by Lemma 3.1)
        there exists a k ≥ 0 such that (α̂_1, k) ⊨ Q and for every 0 ≤ i < k, (α̂_1, i) ⊨ P,
        or else, for all i ≥ 0, (α̂_1, i) ⊨ P
    iff (by Lemma 3.2)
        there exists a k ≥ 0 such that _k|α̂_1 ⊨ Q and for every 0 ≤ i < k, _i|α̂_1 ⊨ P,
        or else, for all i ≥ 0, _i|α̂_1 ⊨ P
    implies (by Lemma 3.3 and the fact that P and Q are stuttering-insensitive)
        there exists a k' ≥ 0 such that _{k'}|α̂_2 ⊨ Q and for every 0 ≤ i' < k', _{i'}|α̂_2 ⊨ P,
        or else, for all i' ≥ 0, _{i'}|α̂_2 ⊨ P
    iff (by Lemma 3.2)
        there exists a k' ≥ 0 such that (α̂_2, k') ⊨ Q and for every 0 ≤ i' < k', (α̂_2, i') ⊨ P,
        or else, for all i' ≥ 0, (α̂_2, i') ⊨ P
    iff (by Lemma 3.1)
        there exists a k' ≥ 0 such that (α_2, k') ⊨ Q and for every 0 ≤ i' < k', (α_2, i') ⊨ P,
        or else, for all i' ≥ 0, (α_2, i') ⊨ P
    iff (by definition)
        α_2 ⊨ P W Q

       A symmetric argument gives the implication in the other direction. This proves that
       P W Q is stuttering-insensitive.

   (b) ∀x : P

       Since α_1 ≃ α_2, we have, for all values v, (α_1)^x_v ≃ (α_2)^x_v.

        α_1 ⊨ ∀x : P
    iff (by definition)
        for all values v, (α_1)^x_v ⊨ P
    iff (since P is stuttering-insensitive and (α_1)^x_v ≃ (α_2)^x_v)
        for all values v, (α_2)^x_v ⊨ P
    iff (by definition)
        α_2 ⊨ ∀x : P

       This proves that ∀x : P is stuttering-insensitive.

   (c) ∃x : P

       Similar to case ∀x : P.

   (d) ¬P

        α_1 ⊨ ¬P
    iff (by definition)
        α_1 ⊭ P
    iff (by the fact that P is stuttering-insensitive)
        α_2 ⊭ P
    iff (by definition)
        α_2 ⊨ ¬P

       This proves that ¬P is stuttering-insensitive.

   (e) P ⇒ Q

       Similar to case ¬P.


B.2 Proofs in Chapter 4


B.2.1 Untimed Systems 


Proof of Lemma 4.1: 

Let (V,A) be an arbitrary pair with V’ C V and A’ C A and let a = soa, 5,a959--++ be an 
arbitrary execution over (V,A). Furthermore, let a’ = a [ (V’,A’) = shaisiais,--. Then 
arp if ap E A’ 

i = i d i = 

sh = se[V! and a, ¢ otherwise 

We prove the lemma by structural induction on P. 


Base Case: 


In the base case P is a step formula over (V’,.A’). We consider the two kinds of step formulas: 


e P=(f), where f is an action function over (V’, A’). 


a’,J) Ff) 
iff (by definition) 


0< 7 < |a’| and (85, 444,55 41) = (f)) or 


( 
( 
( 
‘ > Ja’| and ($)4156; Sfa) F (SY) 
( 
( 


iff (by definition and the fact that ¢ can never be in the range of an action function) 
0<j7 <la’l and ai,, € (s,s) ,,)[f]) or 
j > |a’| and false) 


(0 <j < |a’| and Ciaran € (si, s;. LFf]) 
iff (step 4; see below) 
(0< 7 < jal and aj41 € (5;,5;4:) Lf) 


iff 
(0< 7 < jal and aj41 € (5;, 5;41)Lf]) or 
(7 > |a| and false) 

iff (since ¢ can never be in the range of an action function) 
(0< 7 < jal and aj41 € (5;, 5;41)Lf]) or 
(j= lal and ¢ € (sja), Sa) E/T) 

iff (by definition) 
(0 <j < lal and (s;,4;43,8;41) E (f)) oF 
(7 2 lal and (sja),¢, 8a) F (F)) 

iff (by definition) 
(a9) Ff) 


Step 4 above is justified as follows: first, |a’| = |a| by definition of [. Next, since si = 
8) [((W,A), Shay = Sj41/(V',A’), and f is an action function over (V’,A’), we have that 
(si, 8 LA] = (5;,8;4.)Lf]. Finally, if aj,, = ¢, then aj4, ¢ A’ by definition of f, 
and since f is an action function over (V’,A’), we have ai,, € (si, si.,)[f]) iff aj4. € 
(sj, S;4.)[f]). If ai, #¢, then ai ,, = aj41. That suffices. 


e P is a state transition predicate over (V’, A’). 
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(a,j) FP 

iff (by definition) 
(0 <7 < Ja’| and (55,4541, 5;41) F P) or 
(7 = Ja’| and (814,60, S[a) FP) 

iff (by definition) 
(0 <j < ja’ and (s‘, 1, ,)[P] = true) or 
Gj 2 a’| and (S)ay> Stan EP = true) 

iff (step 3; see below) 
(0 <j < ja| and (s;,5;4,)[P] = true) or 
(7 = Jal and (8)4), a1) P] = true) 

iff (by definition) 
(0 <j < jal and (s;,a;41, 5;41) FE P) or 
Gj 2 a and (Sjajs¢s S}a}) F= P) 

iff (by definition) 
(a,j) EP 
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Step 3 is justified as follows: first, |a’| = |a|, by definition of [. Then, since P is a state 
transition predicate over (V’, A’) and si, = s;,[(V’, A’) for all k, the result directly follows. 


Inductive Step:
Let $Q$ be an arbitrary temporal formula over $(V'_Q, A'_Q)$ and assume as induction hypothesis that for all pairs $(V_Q, A_Q)$ with $V'_Q \subseteq V_Q$ and $A'_Q \subseteq A_Q$, all executions $\alpha_Q$ over $(V_Q, A_Q)$, and all $j_Q \ge 0$,

$(\alpha_Q \lceil (V'_Q, A'_Q), j_Q) \models Q$ iff $(\alpha_Q, j_Q) \models Q$

Assume a similar induction hypothesis for the temporal formula $R$ over $(V'_R, A'_R)$. We consider the different possibilities for $P$ (cf. Section 3.5).


© P=0Q 


(VA), 
by by definition) 
VY, A); j 


NFO@ 


eP=QWR 


j) 


O 


+DEQ 


Similar to case P= CQ. 


e P=Vae:Q 


(af (VW, A), 


j) 


E Va: Q 


iff (by definition) 


for all values v, ((a 


PVA I) 
iff (by definition of $\lceil$ and substitution)
for all values v, (a® f (VU {x}, A’), 7) 
iff (by the induction hypothesis) 
for all values v, (a7, j) EQ 
iff (by definition) 
(a,j) Ve:Q 


I 
& 


• $P = \exists x : Q$
Similar to case $P = \forall x : Q$.


• $P = (Q \Rightarrow R)$
Similar to case $P = \Box Q$.

• $P = \neg Q$
Similar to case $P = \Box Q$.


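Proof of Lemma 4.3:

$\Rightarrow$: Assume $\alpha | A_i \models Q_i$ for all $i$. Then, since $\alpha | A_i \simeq \alpha \lceil A_i$ and $Q_i$ is stuttering-insensitive, we have $\alpha \lceil A_i \models Q_i$ for all $i$. Then by Lemma 4.2, $\alpha \models Q_i$ for all $i$, and thus $\alpha \models Q_1 \wedge \ldots \wedge Q_N$.

$\Leftarrow$: Assume $\alpha \models Q_1 \wedge \ldots \wedge Q_N$. Then $\alpha \models Q_i$ for all $i$, and Lemma 4.2 implies that $\alpha \lceil A_i \models Q_i$ for all $i$. Again, since $\alpha | A_i \simeq \alpha \lceil A_i$ and $Q_i$ is stuttering-insensitive, it follows that $\alpha | A_i \models Q_i$ for all $i$.
∎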


Proof of Proposition 4.4:

By Definition 2.9 we have $L = \{\alpha \in exec(A) \mid \alpha | A_1 \in L_1, \ldots, \alpha | A_N \in L_N\}$. By definition of $|$ we know that if $\alpha \in exec(A)$, then $\alpha | A_i \in exec(A_i)$, for all $i$. Thus, since $L_i$ is induced by $Q_i$, we get $L = \{\alpha \in exec(A) \mid \alpha | A_1 \models Q_1, \ldots, \alpha | A_N \models Q_N\}$. By Lemma 4.3 we finally get $L = \{\alpha \in exec(A) \mid \alpha \models Q_1 \wedge \ldots \wedge Q_N\}$, which proves that $L$ is induced by $Q_1 \wedge \ldots \wedge Q_N$.


Proof of Proposition 4.5: 

Let (Au, 24) = (A, L£)\ A. The proof is trivial since, by Definitions 2.3 and 2.10, exec(A,) = 
exec(A) and Ly = L. 

| 


Proof of Proposition 4.6:
Let $(A_\rho, L_\rho) = \rho((A, L))$. By Definition 2.11 we have $(A_\rho, L_\rho) = (\rho(A), \{\rho(\alpha) \mid \alpha \in L\})$.

First note that since $Q$ is a temporal formula over $A$, Definition 2.4 implies that $\rho(Q)$ is a temporal formula over $A_\rho$.

Now, it is clear that $\alpha \models Q$ iff $\rho(\alpha) \models \rho(Q)$. Since also $exec(A_\rho) = \{\rho(\alpha) \mid \alpha \in exec(A)\}$, it follows that $L_\rho = \{\alpha \in exec(A_\rho) \mid \alpha \models \rho(Q)\}$, which proves that $L_\rho$ is induced by $\rho(Q)$.
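B.2.2 Timed Systems


Proof of Proposition 4.17:

Let $L_{i,s}$, for each $1 \le i \le N$, be a sampling characterization of $\mathcal{L}_i$ such that $L_{i,s}$ is induced by $Q_i$. We have

$\mathcal{L} \stackrel{1}{=} \{\Sigma \in t\text{-}exec^\infty(A) \mid \Sigma | A_1 \in \mathcal{L}_1, \ldots, \Sigma | A_N \in \mathcal{L}_N\}$
$\stackrel{2}{=} \{\Sigma \in t\text{-}exec^\infty(A) \mid (\forall \alpha_1 \text{ samples } \Sigma | A_1 : \alpha_1 \in L_{1,s}), \ldots, (\forall \alpha_N \text{ samples } \Sigma | A_N : \alpha_N \in L_{N,s})\}$
$\stackrel{3}{=} \{\Sigma \in t\text{-}exec^\infty(A) \mid \forall \alpha \text{ samples } \Sigma : \alpha | A_1 \in L_{1,s}, \ldots, \alpha | A_N \in L_{N,s}\}$

where Step 1 follows from Definition 2.26, Step 2 follows from the definition of sampling characterizations, and Step 3 follows from Lemma 4.15 Part 3.

This proves (using Lemma 4.13 Part 2) that $\mathcal{L}$ is induced by $L_s = \{\alpha \in exec^\infty(A) \mid \alpha | A_1 \in L_{1,s}, \ldots, \alpha | A_N \in L_{N,s}\}$, and we have

$L_s \stackrel{1}{=} \{\alpha \in exec^\infty(A) \mid \alpha | A_1 \models Q_1, \ldots, \alpha | A_N \models Q_N\}$
$\stackrel{2}{=} \{\alpha \in exec^\infty(A) \mid \alpha \models Q_1 \wedge \cdots \wedge Q_N\}$

where Step 1 follows from the definition of sampling characterization being induced by temporal formulas and Step 2 follows from Lemma 4.16.

This proves that $L_s$ and, in turn, $\mathcal{L}$ are induced by $Q_1 \wedge \ldots \wedge Q_N$.
∎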


Proof of Proposition 4.18: 

Let (Ay, L4) = (A, L£)\ A. The proof is trivial since, by Definitions 2.19 and 2.27, exec(A,) = 
exec( A), t-exec(A,) = t-evec(A), and L = Ly. 

| 


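Proof of Proposition 4.19:

Let $(A_\rho, \mathcal{L}_\rho) = \rho((A, \mathcal{L}))$ and let $L_s$ be a sampling characterization of $\mathcal{L}$ such that $L_s$ is induced by $Q$. By Definition 2.28 we have $(A_\rho, \mathcal{L}_\rho) = (\rho(A), \{\rho(\Sigma) \mid \Sigma \in \mathcal{L}\})$.

First note that since $Q$ is a temporal formula over $A$, Definition 2.20 implies that $\rho(Q)$ is a temporal formula over $A_\rho$.

Now, it is clear that $exec(A_\rho) = \{\rho(\alpha) \mid \alpha \in exec(A)\}$ and that $\alpha \models Q$ iff $\rho(\alpha) \models \rho(Q)$. Thus, $L_{\rho s} = \{\rho(\alpha) \mid \alpha \in L_s\}$ is induced by $\rho(Q)$. Since also $t\text{-}exec(A_\rho) = \{\rho(\Sigma) \mid \Sigma \in t\text{-}exec(A)\}$ and $\alpha$ samples $\Sigma$ iff $\rho(\alpha)$ samples $\rho(\Sigma)$, we immediately get that $\mathcal{L}_\rho$ is induced by $L_{\rho s}$. That suffices.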


B.2.3. Embedding 
Proof of Lemma 4.21: 


Since Q is a temporal formula over A, a is an execution over A,, variables(A) C variables(A,), 
and acts(A) C acts(A,), Lemma 4.1 yields 


$(\alpha \lceil (variables(A), acts(A))) \models Q$ iff $\alpha \models Q$ (*)




Furthermore, by definition of $untime(\alpha)$ we have $untime(\alpha) \simeq (\alpha \lceil (variables(A), acts(A)))$, and since $Q$ is stuttering-insensitive we have

$untime(\alpha) \models Q$ iff $(\alpha \lceil (variables(A), acts(A))) \models Q$ (**)

Then (*) and (**) imply the result.
| 


Proof of Proposition 4.22: 


First note that since variables(A) C variables(A,) and acts(A) C acts(A,), Q is a temporal 
formula over A,. We have 


L, = {% € t-exec®(A,) | untime(X) € L} 
2 {¥ € t-evec™(A,) | untime(S) — Q} 
= {¥ € t-evec®(A,) | for all a, if a samples N, then untime(a) K Q} 
4 {hu € texec®(A,) | for all a, if a samples U, then a — Q} 


where Step 1 follows from Definition 2.35, Step 2 follows from the fact that £ is induced by 
@ (and untime(%) € exec(A) by definition of untime), Step 3 follows Lemma 4.20, and Step 4 
follows from Lemma 4.21. 

This proves, by Lemma 4.13 Part 2, that L, is induced by Q. 


We show that @ is minimal. Thus, for arbitrary admissible execution a of A, with a —K Q, we 
must show the existence of a timed execution © € L, such that a samples %. 

Let a be an arbitrary admissible execution a of A, such that a —- Q. Let © be a timed 
execution of A, such that a samples &. By Lemmas 4.11 and 4.13 © exists and is admissible. By 
Lemma 4.20 untime(a) = untime(%) and Lemma 4.21 gives untime(a) F Q. Thus, untime(%) — 
@, which implies untime(X) € L. Then, by definition of L, (Definition 2.35), € L,. That 
suffices. 


B.3 Proofs in Chapter 5 


B.3.1 Untimed Systems 
Proof of Lemma 5.10:

Let $m$ be an arbitrary index mapping from $\alpha$ to $\alpha'$ with respect to $R$.

$\Rightarrow$: Assume $\alpha \models \Diamond\Box\neg(C)$. Then, by Lemma 3.5 Part 3, there exists an index $i$ such that ${}_{i}|\alpha \models \Box\neg(C)$. Thus, no actions in $C$ occur in $trace({}_{i}|\alpha)$. By Lemma 5.6 and the fact that $C$ contains external actions only, no actions in $C$ occur in the suffix ${}_{m(i)}|\alpha'$. Thus, ${}_{m(i)}|\alpha' \models \Box\neg(C)$, which, by Lemma 3.5 Part 4, implies that $\alpha' \models \Diamond\Box\neg(C)$. That suffices.

$\Leftarrow$: Assume $\alpha' \models \Diamond\Box\neg(C)$. Then, by Lemma 3.5 Part 3, there exists an index $j$ such that ${}_{j}|\alpha' \models \Box\neg(C)$. Now, by Condition 4 of Definition 5.4, there exists an $i \le |\alpha|$ such that $m(i) \ge j$. Then ${}_{m(i)}|\alpha'$ is a suffix of ${}_{j}|\alpha'$, and consequently, by Lemma 3.5 Part 1, ${}_{m(i)}|\alpha' \models \Box\neg(C)$.

Thus, no actions in $C$ occur in $trace({}_{m(i)}|\alpha')$. By Lemma 5.6 and the fact that $C$ contains external actions only, no actions in $C$ occur in the suffix ${}_{i}|\alpha$. Thus, ${}_{i}|\alpha \models \Box\neg(C)$, which, by Lemma 3.5 Part 4, implies that $\alpha \models \Diamond\Box\neg(C)$. That suffices.

Proof of Lemma 5.11:

Let $m$ be an arbitrary index mapping from $\alpha$ to $\alpha'$ with respect to $R$.

Assume $\alpha' \models \Diamond\Box Q$. Then, by Lemma 3.5 Part 3, there exists an index $j$ such that ${}_{j}|\alpha' \models \Box Q$. Thus, for each state $u$ in ${}_{j}|\alpha'$, we have $u \models Q$. Now, by Condition 4 of Definition 5.4 and the fact that $m$ is nondecreasing, we get the existence of an index $i$ such that for all $i \le k \le |\alpha|$, $m(k) \ge j$. Then, for each state $s$ of $\alpha$ with index $k$ ($i \le k \le |\alpha|$) we have $s \models P$ since (by Condition 2 of Definition 5.4) there exists some $u$ in ${}_{j}|\alpha'$ such that $(s, u) \in R$.

This gives us, for all $k \ge 0$, $({}_{i}|\alpha, k) \models P$. (Even if ${}_{i}|\alpha$ is finite this is true: since $P$ holds in the last state, it also holds in the stuttering step that stutters the last state.) Thus, ${}_{i}|\alpha \models \Box P$, which finally, by Lemma 3.5 Part 4, gives $\alpha \models \Diamond\Box P$.


Proof of Lemma 5.13:

1. Let $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$. Let $s_{h0} \in start(A_h)$ be such that $s_{h0} \lceil variables(A) = s_0$. Define $\alpha_{h0} = s_{h0}$. Then $\alpha_{h0} \lceil (variables(A), acts(A)) = s_0$.

Define $\alpha_{hn}$ inductively as follows. Assume $\alpha_{h(n-1)} = s_{h0} a_1 s_{h1} a_2 s_{h2} \cdots a_{n-1} s_{h(n-1)}$ is an execution of $A_h$ such that $\alpha_{h(n-1)} \lceil (variables(A), acts(A)) = \alpha|_{n-1}$. Then, by Lemma 5.12 Part 1, there exists a step $(s_{h(n-1)}, a_n, s_{hn}) \in steps(A_h)$.

Define $\alpha_{hn} = s_{h0} a_1 s_{h1} a_2 s_{h2} \cdots a_{n-1} s_{h(n-1)} a_n s_{hn}$. Then $\alpha_{hn} \lceil (variables(A), acts(A)) = \alpha|_{n}$.

Then, $\alpha_h = \lim_{n \to |\alpha|} \alpha_{hn}$ has the required property.

2. Directly from Lemma 5.12 Part 2.
∎
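
The inductive construction in Part 1 can be mimicked for finite executions. The following Python sketch is an illustration only and rests on assumptions of ours: states are dictionaries, the added history variable is stored under the hypothetical key `"h"`, and the existence of a matching step of the extended automaton (guaranteed in the proof by Lemma 5.12 Part 1) is modelled by a caller-supplied deterministic `update` function.

```python
def add_history(execution, h0, update):
    """Extend each state of [s0, a1, s1, ...] with a history variable 'h'.

    h0 is the initial history value; update(h, action, next_state) returns
    the history value after the step (an assumed, deterministic choice).
    """
    states = execution[0::2]
    actions = execution[1::2]
    h = h0
    extended = [dict(states[0], h=h)]          # corresponds to alpha_h0
    for action, next_state in zip(actions, states[1:]):
        h = update(h, action, next_state)      # one more step of the extension
        extended += [action, dict(next_state, h=h)]
    return extended


def project(execution_h, variables):
    """Restrict every state to the given variables; actions are kept as is."""
    return [
        {v: item[v] for v in variables} if isinstance(item, dict) else item
        for item in execution_h
    ]


alpha = [{"x": 0}, "inc", {"x": 1}, "inc", {"x": 2}]
# The history variable records the sequence of actions performed so far.
alpha_h = add_history(alpha, (), lambda h, a, s: h + (a,))
```

Projecting `alpha_h` back onto the original variable `x` returns `alpha`, which is the property the lemma requires of the extended execution.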


Proof of Lemma 5.14:

$A \sqsubseteq_S A_h$: Let $\beta \in traces(A)$ and let $\alpha \in exec(A)$ be such that $trace(\alpha) = \beta$. By Lemma 5.13 Part 1 there exists an execution $\alpha_h \in exec(A_h)$ such that $\alpha_h \lceil (variables(A), acts(A)) = \alpha$. Then, since $ext(A) = ext(A_h)$, we have $trace(\alpha_h) = trace(\alpha) = \beta$. Thus, $\beta \in traces(A_h)$. That suffices.

$A_h \sqsubseteq_S A$: Let $\beta \in traces(A_h)$ and let $\alpha_h \in exec(A_h)$ be such that $trace(\alpha_h) = \beta$. By Lemma 5.13 Part 2, $\alpha_h \lceil A \in exec(A)$. Then, since $ext(A) = ext(A_h)$, we have $trace(\alpha_h) = trace(\alpha_h \lceil A) = \beta$. Thus, $\beta \in traces(A)$. That suffices.


Proof of Lemma 5.15:

$(A, L) \sqsubseteq_L (A_h, L_h)$: Let $\beta \in traces(L)$ and let $\alpha \in L$ be such that $trace(\alpha) = \beta$. By Lemma 5.13 Part 1 there exists an execution $\alpha_h \in exec(A_h)$ such that $\alpha_h \lceil A = \alpha$. Thus, by definition of $L_h$, we have $\alpha_h \in L_h$, and since $ext(A) = ext(A_h)$ we finally get $trace(\alpha_h) = trace(\alpha) = \beta$, and thus, $\beta \in traces(L_h)$. That suffices.

$(A_h, L_h) \sqsubseteq_L (A, L)$: Let $\beta \in traces(L_h)$ and let $\alpha_h \in L_h$ be such that $trace(\alpha_h) = \beta$. By definition of $L_h$, $\alpha_h \lceil A \in L$. Then, since $ext(A) = ext(A_h)$, we have $trace(\alpha_h) = trace(\alpha_h \lceil A) = \beta$. Thus, $\beta \in traces(L)$. That suffices.

Proof of Lemma 5.16:

We have

$L_h = \{\alpha_h \in exec(A_h) \mid \alpha_h \lceil A \models Q\}$
$\;\;\; = \{\alpha_h \in exec(A_h) \mid \alpha_h \models Q\}$

where the first equality follows from the definition of $L_h$ and Lemma 5.13 Part 2, and the last equality follows from Lemma 4.1. This shows that $L_h$ is induced by $Q$.


B.3.2. Timed Systems 


Proof of Lemma 5.28:

1. Let $\Sigma = \omega_0 a_1 \omega_1 a_2 \omega_2 \cdots$. Define $h_0$ to be a value of $h$ such that $(fstate(\omega_0) \cup [h \mapsto h_0]) \in start(A_h)$. Define, for all $t \in dom(\omega_0)$, $\omega_{h0}(t) \lceil variables(A) = \omega_0(t)$ and $\omega_{h0}(t).h = h_0$. Then $\omega_{h0}$ is a trajectory of $A_h$.

Now we define $\omega_{hn}$ inductively. By the properties of timed executions, $(\omega_{n-1}, a_n, \omega_n) \in steps(A)$. Then by Lemma 5.27 Part 1 there exists a value $h_n$ such that $(\omega_{n-1} \cup [h \mapsto h_{n-1}], a_n, \omega_n \cup [h \mapsto h_n]) \in steps(A_h)$. Then, for all $t \in dom(\omega_n)$, define $\omega_{hn}(t) \lceil variables(A) = \omega_n(t)$ and $\omega_{hn}(t).h = h_n$.

Then, $\Sigma_h = \omega_{h0} a_1 \omega_{h1} a_2 \omega_{h2} \cdots$ is a timed execution of $A_h$ and $\Sigma_h \lceil variables(A) = \Sigma$.

2. Directly from Lemma 5.27 Part 2.
∎


Proof of Lemma 5.32:
Let $L_s$ be a sampling characterization of $\mathcal{L}$ such that $L_s$ is induced by $Q$ and define

$L_{hs} = \{\alpha_h \in exec^\infty(A_h) \mid \alpha_h \lceil A \in L_s\}$

Similar to the proof of Lemma 5.16 it is easy to see that $L_{hs}$ is induced by $Q$. It now suffices to show that $\mathcal{L}_h$ is induced by $L_{hs}$. We must check two conditions.

1. Assume $\Sigma_h \in \mathcal{L}_h$. We must show that for all $\alpha_h$ that sample $\Sigma_h$, $\alpha_h \in L_{hs}$. So, assume $\alpha_h$ samples $\Sigma_h$. Since $\Sigma_h$ is admissible, also $\alpha_h$ is admissible by Lemma 4.13. Thus, by definition of $L_{hs}$, it suffices to show that $\alpha_h \lceil A \in L_s$.

Since $\Sigma_h \in \mathcal{L}_h$, we have $\Sigma_h \lceil A \in \mathcal{L}$. Lemma 5.31 Part 1 gives $\alpha_h \lceil A$ samples $\Sigma_h \lceil A$. Then $\alpha_h \lceil A \in L_s$ since $L_s$ is a sampling characterization of $\mathcal{L}$. That suffices.

2. Assume $\Sigma_h \in t\text{-}exec^\infty(A_h)$ and for all $\alpha_h$ that sample $\Sigma_h$, $\alpha_h \in L_{hs}$. We must show that $\Sigma_h \in \mathcal{L}_h$. By definition of $\mathcal{L}_h$ it suffices to show that $\Sigma_h \lceil A \in \mathcal{L}$.

Let $\alpha$ be an arbitrary execution of $A$ such that $\alpha$ samples $\Sigma_h \lceil A$. Then Lemma 5.31 Part 2 gives the existence of an execution $\alpha_h$ of $A_h$ such that $\alpha = \alpha_h \lceil A$ and $\alpha_h$ samples $\Sigma_h$. Thus, the assumption for this case implies $\alpha_h \in L_{hs}$. By Lemma 4.13 $\alpha_h$ is admissible. Then the definition of $L_{hs}$ implies that $\alpha \in L_s$. Since $\alpha$ was arbitrary, the definition of sampling characterization implies that $\Sigma_h \lceil A \in \mathcal{L}$. That suffices.

That concludes the proof.
∎


Appendix C 


Invariance Proofs 


In this chapter we prove the invariants stated in the G and C specifications. We use the normal 
proof technique: 


• Show that the invariant is satisfied in every initial state.


• Assume the invariant and all previously proved invariants hold in a state s, and for all
steps (s, a, s') show that the invariant holds in s'.


Many of the invariants consist of several parts. We prove that the conjunction of these parts is 
an invariant. It follows that each conjunct (part) is then itself an invariant. All the parts of the 
invariants are of the form 


If C then P 


where, in some cases, C = true. For the sake of brevity we consider only, in the second part 
of the proof technique above, the steps that can change C' from false to true or make P false 
while C is true since these are the only steps that might invalidate the invariant. We refer to 
such steps as the critical steps for the invariant (part). 
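
The proof technique and the notion of critical steps can be phrased as a check over an explicitly enumerated transition system. The following Python sketch is only an illustration: the toy system, the function names `check_inductive` and `is_critical`, and the encoding of states as tuples are assumptions of ours and are not part of the G or C specifications.

```python
from itertools import product


def check_inductive(initial, steps, inv, assumed=lambda s: True):
    """Return None if inv is inductive, otherwise a counterexample."""
    for s in initial:
        if not inv(s):
            return ("initial", s)
    for s, a, t in steps:
        # Assume the invariant and previously proved invariants hold in s.
        if inv(s) and assumed(s) and not inv(t):
            return ("step", (s, a, t))
    return None


def is_critical(step, c, p):
    """Critical steps for an invariant of the form 'If C then P'."""
    s, _, t = step
    turns_c_on = (not c(s)) and c(t)                 # C changes from false to true
    breaks_p = c(s) and c(t) and p(s) and not p(t)   # P made false while C is true
    return turns_c_on or breaks_p


# Toy system (hypothetical): a state is (mode, x) with x in {0, 1, 2}.
STATES = list(product(("idle", "send"), range(3)))


def toy_steps(with_bug=False):
    steps = []
    for mode, x in STATES:
        if mode == "idle":
            steps.append(((mode, x), "start", ("send", 1)))
        else:
            steps.append(((mode, x), "bump", ("send", min(x + 1, 2))))
            steps.append(((mode, x), "stop", ("idle", 0)))
            if with_bug:
                steps.append(((mode, x), "reset", ("send", 0)))
    return steps


c = lambda s: s[0] == "send"        # condition C
p = lambda s: s[1] > 0              # property P
inv = lambda s: (not c(s)) or p(s)  # If C then P

ok = check_inductive([("idle", 0)], toy_steps(), inv)
bad = check_inductive([("idle", 0)], toy_steps(with_bug=True), inv)
critical_actions = {a for (s, a, t) in toy_steps() if is_critical((s, a, t), c, p)}
```

In the toy system without the faulty `reset` action only the `start` steps are critical, so a hand proof in the style of this appendix would need to examine only those steps.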


C.1 Proof of Invariants at the G Level 


Proof of Invariant 8.1

• Since $mode_s = idle$ in the initial states of G, it follows that both parts of the invariant are satisfied in the initial states.

• We now consider the two parts separately.

1. We consider the critical steps. (Note that none of the steps in G can remove elements from $used_s$.)

$a = \mathit{choose\_id}(id, m)$
This step changes $mode_s$ to send but at the same time the new value of $last_s$ is appended to the end of $used_s$, so Part 1 holds after the step.

$a \in \{\mathit{receive\_pkt}_{rs}(id, b), \mathit{recover}_s\}$
Both of these steps can change $last_s$ but at the same time $mode_s$ is changed to non-send, so Part 1 holds after the steps.

2. The proof of this part follows directly from the proof of Part 1 and the fact that $used_s$ is a queue of IDs. (Remember that $nil \notin ID$.)


Proof of Invariant 8.2

• Since $mode_s = idle$ and $used_s$ is empty in the initial states of G, both parts of the invariant hold in the initial states.

• We assume that both parts hold in state $s$. For each part we consider the critical steps of the form $(s, a, s')$.

1.

$a = \mathit{prepare}$
This step changes $mode_s$ to needid but at the same time $good_s$ is changed to $\emptyset$, so Part 1 holds in $s'$.

$a = \mathit{choose\_id}(id, m)$
This step adds an id to $used_s$ but at the same time $mode_s$ is changed to send, so Part 1 holds in $s'$.

$a = \mathit{grow\_good}_s(ids)$
We consider this case when $s.mode_s = needid$. The step adds identifiers to $good_s$ but since $s.mode_s = needid$ the step can only add ids that do not intersect with $s.used_s$. Thus, since Part 1 is assumed to hold in $s$, it also holds in $s'$.

2.

$a = \mathit{choose\_id}(id, m)$
This step adds the element id from $s.good_s$ to $used_s$ but since $s.mode_s = needid$, the assumption that Part 1 holds in $s$ gives us that $id \notin s.used_s$. Thus Part 2 holds in $s'$.


Proof of Invariant 8.3

• Initially $mode_r = idle$ so the invariant holds.

• Assume that the invariant holds in $s$. We now consider all the critical steps of the form $(s, a, s')$.

1.

$a = \mathit{receive\_pkt}_{sr}(m, id)$
If this step changes $mode_r$ to rcvd, it also adds an element to $buf_r$, so Part 1 holds in $s'$.

$a = \mathit{receive\_msg}(m)$
This step can make $buf_r$ empty, but in this case, $mode_r$ is changed to ack, so Part 1 holds in $s'$.


Proof of Invariant 8.4 


e Part 1 holds initially because mode, = idle. issued, is initially a superset of good, thus 
satisfying Part 2. For Parts 3, 4, 5, and 6 the sets that are supposed to be subsets are 
initially empty, so the result follows. Since last, is initially nil, Parts 7 and 8 are also 
satisfied. 


e For each part of the invariant we consider the critical steps (s, a, s’), where we assume that 
all parts of this invariant hold in s, and that previously proved invariants hold in both s 
and s’. (For Parts 1, 2, and 3, note that issued, can never shrink, and for Parts 4, 5, 6, 
and 8, note that used, can never shrink.) 


1. 


a = prepare 


This step changes mode, to needid, but at the same time good, is made empty, so 
Part 1 holds in s’. 


a = recover, 


This step changes mode, from rec to nonrec (idle) but at the same time issued, is 
changed to some superset of good,, so Part 1 holds in s’. 


a = grow_good , (ids) 


We consider the case where s.mode, = needid and s.mode, # rec. The step adds 
some elements to good,, but in the case we consider, the elements that are added are 
all in s.zssued,. So, since we assume Part 1 holds in s, it also holds in s’. 


a = grow_good,,(ids) 


This step adds elements to good, but at the same time the same elements are added 
to issued,. So, since we assume that Part 2 holds in s, it also holds in s’. 


a = recover, 


This step changes mode, from rec to non-rec, but at the same time issued, is changed 
to some superset of used,, so Part 3 holds in s’. 


a = prepare 


Consider this step when s.mode, # rec. We add an element id from s.good, to used,. 
From Part 1 we get that id belongs to s.issued, so adding id to used, does not violate 
Part 3. 


In the proof, we let id-set denote the set ids(sr) U (if mode, = send then {last,}) in 
any state of G. 
a = choose_id(id,m) 


This step changes mode, to send so s’.last, gets added to id-set, but from Invariant 8.1 
Part 1 we get that s’.last, € s’.used,, so Part 4 is not violated. 


a = send_pkt ,.(m, id) 


This step might add a packet to the channel (sr), but since a precondition for the step 
is s.mode, = send, the id on the packet is already in id-set, thus this step does not 
change id-set. So, since Part 4 holds in s, it also holds in s’. 


a = receive_pkt,,.(m, id) 


This is the only step that may add an identifier to nack-buf ,. The identifier id added 
is in ids(s.sr), so since we assume that Part 4 holds in state s we get that id € s.used,, 
so Part 5 is not violated. 
$a = \mathit{send\_pkt}_{rs}(id, true)$


This step can add a packet with identifier $s.last_r$ to the return channel $rs$. The action is only possible if $s.last_r \in ID$, i.e., if $s.last_r \ne nil$. But then Part 8 gives us that $s.last_r \in s.used_s$, thus this step cannot violate Part 6.


a = send_pkt,.,(id, false) 

This step can add a packet with an identifier from s.nack-buf to rs. From Part 5 in 
state s we get that this identifier is in s.used,, so the step cannot violate Part 6. 

a = receive_pkt,.(m, id) 


This step can change last, to id which belongs to s.good,.. However, at the same time 
id is removed from good,. It remains to be shown that id ¢ s’.issued,. Since we 
assume that all parts of this invariant hold in s, Part 2 gives us that id € s.issued, 
and since issued, is not changed in the step, we get id € s’.issued,. The result follows 
directly. 


a = recover, 

This step changes last, to nil. But since good-ids is a set of elements from JD and 
nil ¢ ID, Part 7 holds in state s’. 

a = grow_good,,(ids) 

This step does not change good-ids, so Part 7 holds in state s’. 


a = receive_pkt,.(m, id) 


This is the only step that can change last, to non-nil. last, is changed to an identifier 
id in a packet in s’.sr. From Part 4 in state s we get that id € s.used,, so since used, 
does not change in the step, Part 8 holds in state s’. 


Proof of Invariant 8.5 


e Initially sr = @ and mode, # send, so the invariant holds. 


• We consider the critical steps (s, a, s'), where we assume that this invariant holds in s,
and that previously proved invariants hold in both s and s’. Note that no step can 
change current-msg, and end up in a state with mode, = send. Also, no step, except 
choose_id(id,m) can change last, and end up in a state with mode, = send. 


1. 


a = choose_id(id,m) 


This step changes mode, from needid to send. From Invariant 8.4 Part 4 we get that 
s.used, D ids(s.sr). From Invariant 8.2 Part 1 and the definition of choose_id(id,m) 


we then get that s’.last, ¢ ids(s’.sr), so this step does not invalidate the invariant. 


Proof of Invariant 8.6 


e Initially current-ok = false, so all parts of the invariant hold. 
• For each part of the invariant we consider the critical steps (s, a, s'), where we assume that
all parts of this invariant hold in s, and that previously proved invariants hold in both
s and s'. Note, for Parts 3, 4, 6, and 7, that no step, except choose_id(id,m), can
change $last_s$ without also changing $mode_s$ to something other than send.


1. 


a = prepare 


This changes current-ok to true if s.mode, # rec, but at the same time mode, is 
changed to needid, so Part 1 holds in state s’. 


a = receive_pkt,.,(id, b) 

In order for this step to change mode, to idle, we must have s.mode, = send and 
(s.last,,b) € s.rs. In that case the step can only violate Part 1 if s.current-ok = true, 
but this cannot be the case since we assume that Part 4 holds in state s. Thus, the 
step cannot violate Part 1. 


a = crash, 


This step can change mode, from needid or send to rec, but at the same time 
current-ok is set to false, so Part 4 holds in state s 


a = prepare 


This step changes current-ok to true, but only if mode, # rec, so Part 2 holds in s. 


a = crash, 


This is the only step that can change mode, from non-rec to rec but at the same 
time current-ok is made false, so Part 2 holds in s. 


a = choose_id(id,m) 


This is the only step that can change the condition in Part 2 from false to true. This 
happens if s.current-ok = true. Since s.mode, = needid, Part 5 which we assume 
holds in s gives us that s’.last, € s.good, which again implies that s’.last, € s'.good,. 
From Invariant 8.4 Part 7 we get that s’.last, ¢ s’.good,. Thus s’.last, # s‘.last,, so 
Part 3 holds in s’. 


a = receive_pkt,,.(m, id) 


This step can make $s'.last_s = s'.last_r$ but in this case current-ok is changed to false,
so Part 3 holds in s'.


a = recover, 


Consider this step when $mode_s = send$ and current-ok = true. The step changes $last_r$
to nil but from Invariant 8.1 Part 2 we have $s'.last_s \ne nil$, so Part 3 holds in s'.


a = choose_id(id,m) 


This is the only step that can change the condition in Part 2 from false to true. This 
happens if s.current-ok = true, so assume this. In state s we get from Invariant 8.4 
Part 6 that all ids on s.rs arein s.used,. From Invariant 8.2 Part 1 we get that s’.last, ¢ 
s.used,. Since rs is not changed in the step, we finally conclude that (s’.last,,b) ¢ s'.rs, 
so Part 4 holds in state s’. 
$a = \mathit{send\_pkt}_{rs}(id, true)$
Consider this action when $mode_s = send$ and current-ok = true. $(s.last_r, true)$ might
be added to $rs$, but from Part 3 we get that Part 4 is not violated.


a = send_pkt,.,(id, false) 

Consider this action when mode, = send and current-ok = true. A packet with an id 
from s.nack-buf, might be added to rs, but from Part 7 (which we assume holds in s) 
we get that Part 4 is not violated. 


a = prepare 
This step can make current-ok = true and mode, = needid but at the same time 
good, is made empty, so Part 5 holds in state s’. 


a = grow_good , (ids) 
This step can only add elements from good, to good, when current-ok = true and 
mode, = needid, so Part 5 holds in state s’. 


a = shrink_good,( ids) 
This step can only remove elements not in good, from good, when current-ok = true 
and mode, = needid, so Part 5 holds in state s’. 


a = choose_id(id,m) 
Consider this step when s.current-ok = true. The step changes mode, to send and 


changes last, to a value from s.good,. Since s.mode, = needid, Part 5 gives us that 
8'.last, € s.good,, so since good, is not changed in the step, Part 6 holds in s’. 


a = shrink_good,( ids) 
When current-ok = true and mode, = send, this step cannot remove s.last, from 
good,,, so Part 6 holds in s’. 


a = choose_id(id,m) 

Consider this step when s.current-ok = true. The step changes mode, to send and 
changes last, to a value from s.good,. Since s.mode, = needid, Invariant 8.2 Part 1 
gives us that s’.last, ¢ s.used,. From Invariant 8.4 Part 5 we then get that s’.last, ¢ 
s.nack-buf, which again implies s’.last, ¢ s’.nack-buf,. since nack-buf,, is not changed 
in the step. So, Part 7 holds in state s’. 


a = receive_pkt,.(m, id) 

This step can add an identifier to nack-buf,. Assume s.current-ok = true and 
s.mode, = send. We must show that s.last, (= s’.last,) cannot be added to nack-buf,, 
under these assumptions. From Part 6 we have that $s.last_s \in s.good_r$, so from the
definition of receive_pkt,.(m, id) we get that nack-buf,. is not changed. Thus, Part 7 
holds in state s’. 


Proof of Invariant 8.7 


Parts 1 and 2 are reformulations of Invariant 8.6 Parts 3 resp. 4. 
Proof of Invariant 8.8


e Since initially mode, = idle and current-ack, = false, all parts hold. 


e For each part of the invariant we consider the critical steps (s, a, s’), where we assume that 
all parts of this invariant hold in s, and that previously proved invariants hold in both s 
and s'. Note, for Parts 1, 2, and 3, that no step, except choose_id(id,m), can change
$last_s$ without also changing $mode_s$ to something other than send. Note also that no step
can make good-ids grow; good-ids can only shrink.


1. 


a = choose_id(id,m) 
This step changes mode, to send. In state s we get from Invariant 8.4 Part 4 that 
s.used, D ids(sr). From the definition of choose_id(id,m) we see that s’.last, is placed 


at the end of used,, thus by the definition of the partial order of identifiers we see that 
Part 1 holds in s’. 


a = send_pkt ,.(m, id) 


This step might add (m, s.last,) to sr while mode, = send. But since Part 1 is assumed 
to hold in s, it is obvious that it also holds in s’. 


a = choose_id(id,m) 

Although this step changes mode, from needid to send, it does not make last, = last,.. 
To see why this is so, note that either s’.Jast, = nil in which case the result follows 
directly (since s’.dast, # nil by Invariant 8.1 Part 1) or s’.last, = s.last, # nil in 
which case Invariant 8.4 Part 8 implies that s’.last, € s.used, and Invariant 8.2 Part 1 
implies that s’ last, ¢ s.used,, so again the result follows. Thus, Part 2 holds in s’. 


a = receive_pkt,,.(m, id) 

Consider the case where s.mode, = s'.mode, = send, id = s.last, = s'.last, € s.good,., 
and s.mode, = s'.mode, # rec. In this case we get s’.last, = s'.last,. We must 
show that ({s'.last,} U ids(s'.sr)) 9 s'.good-ids = ). From Invariant 8.4 Parts 3 and 
4 we get that s’.issued, D {s'.last,} U ids(s’.sr). So what remains to be shown is 
that ({s'.last,} U ids(s’.sr)) 1 s.good, = @. From Part 1 we get that id > ({s.last,}U 
ids(s.sr)). Since we remove all identifiers less than or equal to id from good, in this 
step, and since Invariant 8.4 Part 4 ensures that all packets in sr have identifiers that 
are related to id, the result follows. Thus, Part 2 holds in s’. 


a = send_pkt ,.(m, id) 


This step can change sr, but only with a packet with the identifier s.last,. Since we 
assume that this Part 2 holds in s, it follows that it also holds in s’. 


a = choose_id(id,m) 

Although this step changes mode, from needid to send, it does not make the packet 
(s'.last,, true) belong to s’.rs. We show why this is so. Since rs is not changed in the 
step, we get from Invariant 8.4 Part 6 that s.used, D ids(s’.rs). Invariant 8.2 Part 1 
together with the definition of choose_id(id,m) gives us s’.last, ¢ s.used,. Thus we 
get s’.last, ¢ ids(s'.rs) which gives the result. So, Part 3 holds in s’. 
$a = \mathit{send\_pkt}_{rs}(id, true)$

Consider this step while s.mode, = s'.mode, = send and id = s.last, = s.last, = 
s'.last,. The step might succeed in putting the packet (s’.last,, true) into the channel. 
We show that ({s’.last,} U ids(s’.sr))  s’.good-ids = §. From Part 2 we get that 
({s.last,}Uids(s.sr))s.good-ids = ). Since neither last,, sr, nor good-ids are changed 
in the step, the result follows directly. So, Part 3 holds in s’. 


a = send_pkt ,,(m, id) 

This step can change sr, but only with a packet with the identifier s.last,. Since we 
assume that this Part 2 holds in s, it follows that it also holds in s’. 

a = receive_pkt,.,(id, b) 


This step can change mode, to idle and current-ack, to true if b = true and id = 
s.last,, thus, (s.last,, true) must be on s.rs. Then Part 3 implies that ids(s.sr) 
good-ids = ). It now directly follows that Part 4 holds in state s’. 


Proof of Invariant 8.9 


e Since initially buf, = ¢, all parts of the invariant hold. 


e For each part of the invariant we consider the critical steps (s, a, 5’), where we assume that 


all parts of this invariant hold in s, and that previously proved invariants hold in both s 
and s’. 


a = recover, 


This step changes mode, to idle but at the same time buf, is made empty, so Part 1 
holds in s’. 


a = send_pkt,.,(id, true) 


This step can change mode, to idle, but from Part 2 in state s we get buf, = ©, so 
Part 1 holds in s’. 


a = cleanup, 


This step changes mode, to idle but since s.mode, € {idle, ack} from the precondi- 
tion, this part and Part 2 imply that buf, was already empty. Thus, Part 1 holds in 


s\. 


a = receive_pkt,.(m, id) 


We consider this step in two different situations:
- The step can make $buf_r$ nonempty but at the same time $mode_r$ is changed to rcvd.
- The step can change $mode_r$ from idle to ack, but then Part 1 implies that $buf_r$ was already empty.

So, Part 2 holds in state s'.
a = receive_msg(m)


This step can change mode, to ack but this only happens if s’.buf, = ¢, so Part 2 
holds in state s’. 


a = choose_id(id,m) 


Although this step makes mode, = send, it does not make the packet (s’.last,, true) 
belong to s’.rs. The argument is the same as for the corresponding case in the proof 
of Invariant 8.8 Part 3. So, Part 3 holds in state s’. 


a = send_pkt (id, true) 


This step can put (s’.last,, true) into rs but since s.mode, = ack, Part 2 gives us that 
s.buf,.(= s’.buf,,) = ¢. So, Part 3 holds in state s’. 


a = receive_pkt,,.(m, id) 


This step might add an element to $buf_r$. We show that this cannot happen while
$mode_s = send$ and $(last_s, true) \in rs$. If an element is added to $buf_r$ in the step, then
$id \in s.good_r$, i.e., $ids(s.sr) \cap s.good\text{-}ids \ne \emptyset$, but this contradicts Invariant 8.8 Part 3.
So, Part 3 holds in state s'.


a = receive_pkt,.,(id, true) 


Consider this step when id = s.last,. Then (s.last,, true) € s.rs. Since s.mode, = 
send, Part 3 implies that s.buf, = ¢ which in turn implies that s’.buf, = ¢. So, Part 4 
holds in state s’. 


a = receive_pkt,,.(m, id) 


This step might add an element to buf,. The argument that this cannot happen while 
mode, = idle and current-ack, = true is similar to the argument in the corresponding 
case in the proof of Part 3, only in this case we get a contradiction with Invariant 8.8 
Part 4. So, Part 4 holds in state s’. 


Proof of Invariant 8.10 


e Initially nack-buf,, = ¢ and rs = 9, so the parts hold. 


e For each part of the invariant we consider the critical steps (s, a, s’), where we assume that 
all parts of this invariant hold in s, and that previously proved invariants hold in both s 
and s’. Note, that no steps can make good-ids grow. 


1. 


a = receive_pkt,,.(m, id) 


Consider this step when s.mode, # rec and id ¢ s.good,. Then id might be added to 
nack-buf,.. Since id ¢ s.good, and good, is unchanged in the step we get s’.nack-buf,. 
s'.good, = () (since we assume that this Part 1 holds in s). From Invariant 8.4 Parts 3 
and 5 it follows that s’.nack-buf,, 0 s'.issued, = 9. So, Part 1 holds in state s’. 


a = send_pkt (id, true) 


This step might add (last,,true) to rs but from Invariant 8.4 Part 7 we get that 
last, ¢ good-ids, so this step cannot violate Part 2. 
$a = \mathit{send\_pkt}_{rs}(id, false)$


Then id € s.nack-buf,, so Part 1 directly gives us that this step cannot violate Part 2. 


Proof of Invariant 8.11 


e Initially mode, = idle so both parts hold. 


• For each part of the invariant we consider the critical steps (s, a, s'), where we assume that
both parts of this invariant hold in s, and that previously proved invariants hold in both s
and s'. Note that no action, except choose_id(id,m), can change last_s without also changing
mode_s to non-send. Also, from Invariant 8.1 Part 2 we get that all steps that change last_r
to nil are not critical.


a = choose_id(id,m) 


Although this step changes mode_s to send, it does not make the packet s'.last_s belong
to s'.nack-buf_r. We show why this is so. Invariant 8.2 Part 1 implies that s'.last_s ∉
s.used_s. From Invariant 8.4 Part 5 and the fact that nack-buf_r is not changed in the
step, we get that s.used_s ⊇ s'.nack-buf_r, which gives the result. So, Part 1 holds in
state s'.


a = receive_pkt_sr(m, id)


We consider two cases. 


— Consider the step when id = last_s. Then last_s can be added to nack-buf_r, but this
can only happen if last_s ≠ last_r, so Part 1 is not violated.

— Consider the step when s.mode_r ≠ rec, id = last_s, and last_s ∈ s.good_r. Then
s'.last_r = s'.last_s. We show that then s.last_s ∉ s.nack-buf_r (which is the same as
showing s'.last_s ∉ s'.nack-buf_r). First assume s.last_s ∈ s.nack-buf_r. Then
Invariant 8.10 Part 1 implies that s.last_s ∉ s.good_r, but this contradicts the assumption
that last_s ∈ s.good_r. Thus, Part 1 holds in state s'.


a = choose_id(id,m)

Although this step changes mode_s to send, it does not make the packet (s'.last_s, false)
belong to s'.rs. The argument that this is so is similar to the argument in the
corresponding case in the proof of Invariant 8.8 Part 3. So, Part 2 holds in state s'.


a = send_pkt_rs(id, false)

Consider this step when id = last_s, i.e., last_s is first on s.nack-buf_r. Then Part 1
implies that s.last_s ≠ s.last_r, so, since neither last_s nor last_r changes in the step,
Part 2 holds in state s'.


a = receive_pkt_sr(m, id)

Assume s.mode_r ≠ rec and last_s = id ∈ s.good_r. Then s'.last_r = s'.last_s. We show
that then (last_s, false) ∉ rs. First assume (last_s, false) ∈ rs. Then Invariant 8.10
Part 2 implies that last_s ∉ s.good-ids, but this contradicts the assumption that last_s ∈
s.good_r. Thus, Part 2 holds in state s'.


Proof of Invariant 8.12
• The invariant is explicitly required to hold in all start states.


• We consider the critical steps (s', a, s), where we assume that the invariant holds in s', and
that previously proved invariants hold in both s' and s.


1. a = recover, or a = shrink_good,.(ids)


These steps explicitly require the invariant to hold in s. 


C.2 Proof of Invariants at the C Level 


In this section we prove the invariants of C presented in Section 10.5.2. As above we prove
the invariants by induction, proving that they hold in the (unique) start state and that all steps 
preserve the invariants. As above, in the inductive step of the inductive arguments we only 
consider “critical steps” that might invalidate the invariant. 

In the proofs the steps have the form (s, a, s’). 
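
The inductive scheme described above can be illustrated on a toy example. The following is a minimal Python sketch, not one of the protocols of this report: the two-variable state (mode, last), the actions choose_id and receive_ack, and the helper is_inductive are hypothetical names chosen for illustration only. The helper checks the base case on the start state and then checks that every step (s, a, s') starting in a state s satisfying the invariant ends in a state s' satisfying it.

```python
from itertools import product

# Hypothetical toy automaton (not from the report): a state is (mode, last).
MODES = ("idle", "send")
IDS = range(4)

def start_states():
    return [("idle", 0)]

def steps(state):
    """Yield (action, next_state) pairs for the steps enabled in `state`."""
    mode, last = state
    if mode == "idle" and last + 1 in IDS:
        yield "choose_id", ("send", last + 1)
    if mode == "send":
        yield "receive_ack", ("idle", last)

def is_inductive(invariant):
    """Check the base case and the inductive step over all candidate states."""
    if not all(invariant(s) for s in start_states()):
        return False
    for s in product(MODES, IDS):
        if not invariant(s):
            continue  # the induction hypothesis does not hold in s
        for _action, s_next in steps(s):
            if not invariant(s_next):
                return False
    return True

def sending_implies_positive(state):
    """If mode = send then last > 0."""
    mode, last = state
    return mode != "send" or last > 0

def last_below_three(state):
    """last < 3; true initially but not preserved by choose_id."""
    return state[1] < 3
```

In this toy system only choose_id is a critical step for the first invariant, since receive_ack leaves the mode different from send. The second predicate holds in the start state but is not preserved by choose_id, so the check rejects it.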


Proof of Invariant 10.1 


• Initially all the involved variables are 0, so all parts hold.


• 1. a = tick_s(t)
This step changes both ctime_s and time_s to t.

2. a = tick_r(t)
This step changes both ctime_r and time_r to t.

3. a = ν

The precondition on the time-passing steps of the clock subsystem (and thus on all of
C) ensures that |s'.ctime_s − s'.now| ≤ ε. Part 1 then gives the result.

4. a = ν

The precondition on the time-passing steps of the clock subsystem (and thus on all of
C) ensures that |s'.ctime_r − s'.now| ≤ ε. Part 2 then gives the result.

5. Parts 3 and 4 directly imply the result.
∎


Proof of Invariant 10.2 
• Initially upper_r = β > 2ε + l'_r > 2ε. Since initially now = time_s = time_r = 0, all the
invariants hold.
• 1. a = recover_r
This makes s'.mode_r ≠ rec, but at the same time s'.upper_r = s'.time_r + β > s'.time_r +
2ε + l'_r ≥ s'.now + ε + l'_r, where the last inequality follows from Invariant 10.1 Part 4.
a = increase-upper_r(t)


As for the previous case, s'.upper_r ≥ s'.now + ε + l'_r.

a = ν

Assume s.mode_r ≠ rec. From the upper time bound on the class CG,2 consisting
of all increase-upper_r(t) actions we have s'.now ≤ s.last(CG,2). The variable
last(CG,2) is set to now + l'_r whenever a recover_r step occurs (since then CG,2
becomes enabled) or an increase-upper_r(t) step occurs (since then increase-upper_r(t)
becomes reenabled). Now, since we assume s.mode_r ≠ rec, let now_0 and upper_0 denote
real time and upper_r right after the last recover_r or increase-upper_r(t) step. Then
s'.now ≤ s.last(CG,2) = now_0 + l'_r, so now_0 ≥ s'.now − l'_r. Now, from the recover_r
and increase-upper_r(t) cases above we finally get s'.upper_r = upper_0 ≥ now_0 + ε + l'_r ≥
(s'.now − l'_r) + ε + l'_r = s'.now + ε.

Note: We are here actually departing from our normal way of proving invariants
since we use more information, like now_0, than is available in s. What we could have
done was to introduce a history variable now_0 that is set to now in recover_r and
increase-upper_r(t) steps. We could then easily have proved the invariants


If mode_r ≠ rec then last(CG,2) = now_0 + l'_r and now ≤ now_0 + l'_r
upper_r ≥ now_0 + ε + l'_r


from which the result would follow. 

We go through the same arguments but have chosen, for brevity, to avoid explicitly 
introducing the extra history variable. 

2. This part follows directly from Part 1 and Invariant 10.1 Part 3.

3. This part follows directly from Part 1 and Invariant 10.1 Part 4.
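
The history-variable alternative mentioned in the note above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the receiver automaton of this report: the constants EPS, L_UP and BETA, the variable names, and the two kinds of steps are hypothetical stand-ins for the clock skew bound, the upper time bound of the increase-upper class, and the amount by which upper is set ahead of the local clock. The history variable now0 records real time at the most recent reset of upper, and the loop checks the two auxiliary invariants together with the conclusion upper ≥ now + EPS after every step.

```python
import random

# Hypothetical constants (the names are not from the report).
EPS = 2                     # bound on |time_r - now|
L_UP = 3                    # upper time bound of the increase-upper class
BETA = 2 * EPS + L_UP + 1   # chosen so that BETA > 2*EPS + L_UP

def run(seed, n_steps=500):
    """Simulate a toy receiver clock; return True if the invariants always held."""
    rng = random.Random(seed)
    now, time_r = 0, 0
    upper = BETA            # initial value of upper
    now0 = 0                # history variable: real time at the last reset
    last = now0 + L_UP      # latest real time at which the next reset may occur
    for _ in range(n_steps):
        if rng.random() < 0.3:
            # reset step: raise upper and record the history variable
            upper = time_r + BETA
            now0 = now
            last = now + L_UP
        else:
            # time-passage step: real time may not pass beyond `last`
            now = rng.randint(now, last)
            # the local clock never runs backwards and stays within EPS of now
            time_r = rng.randint(max(time_r, now - EPS), now + EPS)
        auxiliary = (last == now0 + L_UP
                     and now <= now0 + L_UP
                     and upper >= now0 + EPS + L_UP)
        if not auxiliary or upper < now + EPS:
            return False
    return True
```

With the history variable in the state, each auxiliary invariant can be checked from the pre-state and post-state of a single step, which is the form of argument used elsewhere in this appendix.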


Proof of Invariant 10.3 


• Initially last_s = time_s = 0 and mode_s = idle, so both parts hold.


• 1. a ∈ {choose_id(t), recover_s, tick_s(t)}


All such steps clearly preserve this part. 


2. a = choose_id(t)

Changes mode_s to send but also explicitly sets s'.last_s = t > s.last_s ≥ 0.


Proof of Invariant 10.4 


Straightforward. 


Proof of Invariant 10.5 


Straightforward. 


Proof of Invariant 10.6 


Straightforward.


Proof of Invariant 10.7


• Initially lower_r = time_s = last_s = 0, so both parts of the invariant hold.


• 1. No steps can make time_s smaller, so we need only check the steps that make lower_r
bigger.


a = recover_r

Then s'.lower_r = s.upper_r and s.upper_r + 2ε < s.time_r. Therefore, s'.lower_r <
s.time_r − 2ε ≤ s.time_s = s'.time_s, where we have used Invariant 10.1 Part 5.


a = increase-lower_r(t)

Then s'.lower_r ≤ s.time_r − ρ < s.time_r − (k l_s + d + 2ε) ≤ s.time_r − 2ε ≤ s.time_s =
s'.time_s, where we again have used Invariant 10.1 Part 5.


a = receive_pkt_sr(m, t)

The only way for lower_r to increase is for s'.lower_r = t, but then, since ((m,t),_) ∈ s.sr,
Invariants 10.6 Part 1 and 10.3 Part 1 imply that s'.lower_r ≤ s.last_s ≤ s.time_s =
s'.time_s.


2. a ∈ {recover_r, increase-lower_r(t)}


Same argument as for the previous part. 


a = receive_pkt_sr(m, t)

Assume s'.last_s < s'.time_s. Since s.last_s = s'.last_s and s.time_s = s'.time_s, we also
have s.last_s < s.time_s. The only way for lower_r to increase is for s'.lower_r = t, but
then, since ((m,t),_) ∈ s.sr, Invariant 10.6 Part 1 implies that s'.lower_r ≤ s.last_s <
s.time_s = s'.time_s.


a = tick_s(t)

Assume s'.last_s < s'.time_s. From Invariant 10.3 Part 1 we have s.last_s ≤ s.time_s. We
consider cases:

— s.last_s < s.time_s
Then s.lower_r < s.time_s by the inductive hypothesis, so we have s'.lower_r =
s.lower_r < s.time_s ≤ s'.time_s, as needed, where the last inequality follows from
the definition of tick_s(t).

— s.last_s = s.time_s
Then since s.last_s = s'.last_s < s'.time_s we have s.time_s < s'.time_s. Since s'.lower_r =
s.lower_r, and s.lower_r ≤ s.time_s by Part 1, we have s'.lower_r < s'.time_s, as needed.
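
The chain of inequalities in the increase-lower case of Part 1 can be checked mechanically on small integer values. The sketch below is illustrative only: the function names are hypothetical, and the hypothesis rho ≥ k·l_s + d + 2·eps is an assumption read off from the way the proof uses these constants, not a quoted definition. It states the three hypotheses used in that case and confirms that the conclusion lower ≤ time_s follows on every point of a small grid.

```python
from itertools import product

def chain_holds(eps, k, l_s, d, rho, time_r, time_s, lower_new):
    """Return True unless the hypotheses hold and the conclusion fails."""
    hypotheses = (
        rho >= k * l_s + d + 2 * eps            # assumed relation between the constants
        and lower_new <= time_r - rho           # effect of the increase-lower step
        and abs(time_s - time_r) <= 2 * eps     # clock agreement, as used in the proof
    )
    return (not hypotheses) or lower_new <= time_s

def check_small_grid(eps=1, k=2, l_s=1, d=1):
    """Exhaustively test the implication on a small grid of integer values."""
    values = range(12)
    return all(
        chain_holds(eps, k, l_s, d, rho, time_r, time_s, lower_new)
        for rho, time_r, time_s, lower_new
        in product(range(9), values, values, values)
    )
```

A grid check of this kind is no substitute for the proof, but it is a quick way to catch a sign or direction error when transcribing such a chain.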


Proof of Invariant 10.8 


Straightforward. 


Proof of Invariant 10.9 


Straightforward. 




Proof of Invariant 10.10 


• Initially deadline = ∞ and now = 0, and since mode_s = idle we have bound = ∞, so all
parts hold.


• 1. a = choose_id(t)

Then s'.last_s = t. Let m = s'.current-msg_s.

If s.mode_r = s'.mode_r = rec then s'.deadline = s.deadline and the induction hypothesis
Part 7 implies that s.deadline = ∞, so we are done.

So, assume s.mode_r ≠ rec. From the precondition of choose_id(t) we have t >
s.last_s. Now Invariants 10.5 Part 1 and 10.6 Part 1 imply, since s'.count_sr(m,t) =
s.count_sr(m,t) and s'.sr = s.sr, that s'.count_sr(m,t) = 0 and (m,t) ∉ packets(s'.sr).
Now, since CG, becomes reenabled in s' we have s'.last(CG,) = s'.now + l_s. Thus,


s'.bound = s'.last(CG,) + (k − 1 − s'.count_sr(m,t)) l_s + d
         = s'.now + l_s + (k − 1) l_s + d
         = s'.deadline


That suffices. 


a = send_pkt_sr(m,t)

We consider cases

— (m,t) ∈ packets(s.sr)
Then s'.bound = s.bound since the mintime of the (m,t) packets does not change.
Since also s'.deadline = s.deadline, the result follows.

— (m,t) ∉ packets(s.sr)

* (m,t) ∉ packets(s'.sr)
Then s'.count_sr(m,t) = s.count_sr(m,t) + 1. We now have

s'.bound = s'.last(CG,) + (k − 1 − s'.count_sr(m,t)) l_s + d
         = s'.now + l_s + (k − 1 − s'.count_sr(m,t)) l_s + d
         = s'.now + (k − 1 − s.count_sr(m,t)) l_s + d
         ≤ s.last(CG,) + (k − 1 − s.count_sr(m,t)) l_s + d
         = s.bound

The induction hypothesis Part 1 now implies

s'.deadline = s.deadline ≥ s.bound ≥ s'.bound

* (m,t) ∈ packets(s'.sr)
Then s'.bound = d + s'.now and

s.bound = s.last(CG,) + (k − 1 − s.count_sr(m,t)) l_s + d
        ≥ s.last(CG,) + d
        ≥ s'.now + d
        = s'.bound

where the first inequality follows from Invariant 10.5 Part 2 and the second
inequality follows from the facts that time cannot pass beyond any last(C) variable
and s'.now = s.now.

The induction hypothesis Part 1 now implies

s'.deadline = s.deadline ≥ s.bound ≥ s'.bound

a = receive_pkt_sr(m, t)

For such a step to change either bound or deadline, i.e., for such a step to be able to
invalidate the invariant part under consideration, we must have s.mode_s = send (=
s'.mode_s) and t = s.last_s (= s'.last_s). Invariant 10.6 Part 2 then implies that m =
s.current-msg_s (= s'.current-msg_s).

If s.deadline = ∞, then also s'.deadline = ∞ and the result follows.

So, assume s.deadline ≠ ∞. The induction hypothesis Part 7 then implies s.mode_r ≠
rec.

We now show that s.lower_r < t ≤ s.upper_r.

The lower bound follows from the induction hypothesis Part 6 and the fact that
t = s.last_s.

For the upper bound we have from Invariants 10.2 Part 2 and 10.3 Part 1 that
s.upper_r ≥ s.time_s ≥ s.last_s = t.

Then from the definition of receive_pkt_sr(m,t) we see that s'.deadline = ∞, and the
result follows.


a = receive_pkt_rs(t, b)

For such a step to be able to invalidate the invariant part under consideration, we
must have s.mode_s = send and s.last_s = t.

Then Invariant 10.6 Part 6 implies that s.last_s = t ≤ s.lower_r, but then the induction
hypothesis Part 6 implies that s'.deadline = s.deadline = ∞. That suffices.


2. a = choose_id(t)
Then Invariant 10.5 Part 1 and the definitions of bound and last(CG,) imply that

s'.bound = s'.now + l_s + (k − 1) l_s + d > s'.now


a = send_pkt_sr(m,t)
We consider cases
— (m,t) ∈ packets(s'.sr)

* (m,t) ∈ packets(s.sr). Then s'.bound = s.bound (uses the fact that
Invariant 10.9 Part 1 implies that mintime((m,t), s'.sr) = mintime((m,t), s.sr)), so
the result follows from the induction hypothesis.

* (m,t) ∉ packets(s.sr). Then s'.bound = s'.now + d > s'.now.

— (m,t) ∉ packets(s'.sr). Then s'.last(CG,) = s'.now + l_s, so Invariant 10.5 Part 2
implies
s'.bound = s'.now + l_s + (k − 1 − s'.count_sr(m,t)) l_s + d > s'.now


a = receive_pkt_sr(m, t)

For such a step to change bound we must have s.mode_s = send, s.last_s = t, and
s.current-msg_s = m. In all other cases the induction hypothesis immediately gives the
result.

The step removes ((m,t),t'), for some t', from sr. If t' ≠ mintime((m,t), s.sr) then
s'.bound = s.bound, and again the induction hypothesis gives the result. So, assume
t' = mintime((m,t), s.sr).


We consider cases

— (m,t) ∈ packets(s'.sr)
Then mintime((m,t), s'.sr) ≥ mintime((m,t), s.sr), which implies that s'.bound ≥
s.bound and the result follows.

— (m,t) ∉ packets(s'.sr)
Then, since s'.last(CG,) ≥ s'.now we have (with a little help from Invariant 10.5
Part 2)

s'.bound = s'.last(CG,) + (k − 1 − s'.count_sr(m,t)) l_s + d > s'.now


a = ν
If s.mode_s = s'.mode_s ≠ send, then s'.bound = ∞ and the result follows. So, assume
s.mode_s = s'.mode_s = send.
Let m = s.current-msg_s = s'.current-msg_s and t = s.last_s = s'.last_s. We consider
cases
— (m,t) ∈ packets(s.sr)
Then ((m,t), mintime((m,t), s.sr)) ∈ s.sr and from the precondition of the
time-passing steps of the channel sr we have s'.now ≤ mintime((m,t), s.sr). Thus,
since s'.sr = s.sr,
s'.now ≤ mintime((m,t), s.sr) < mintime((m,t), s'.sr) + d = s'.bound
— (m,t) ∉ packets(s.sr)
Then, since s'.last(CG,) ≥ s'.now we have (with a little help from Invariant 10.5
Part 2)
s'.bound = s'.last(CG,) + (k − 1 − s'.count_sr(m,t)) l_s + d > s'.now


3. This part follows directly from Parts 1 and 2.

4. a = choose_id(t)

If s.mode_r = rec then s'.deadline = s.deadline = ∞, by the induction hypothesis
Part 7, so the result follows.

So, assume s.mode_r ≠ rec. Then s'.deadline = s'.now + k l_s + d and s'.last_s = s'.time_s.
Invariant 10.1 Part 3 then implies that s'.deadline ≤ s'.last_s + ε + k l_s + d.


a = recover, 
Then the induction hypothesis Part 7 implies that s.deadline = ∞, and since we have
s'.deadline = s.deadline, the result follows.

5. This part follows directly from Parts 3 and 4.

6. a ∈ {recover_s, recover_r}

Then by the induction hypothesis Part 7 we have s'.deadline = s.deadline = ∞. That
suffices.


a = choose_id(t)

Then s'.last_s = s'.time_s = s.time_s > s.last_s, by definition of choose_id. By
Invariant 10.7 Part 2, s.lower_r < s.time_s. But since s'.lower_r = s.lower_r and s.time_s =
s'.last_s, we have s'.lower_r < s'.last_s, as needed.

a = increase-lower_r(t)
We only need to check such steps when s'.deadline = s.deadline ≠ ∞.

By definition of increase-lower_r(t), we have s'.lower_r ≤ s'.time_r − ρ < s'.time_r −
(k l_s + d + 2ε). It suffices to show that this is less than or equal to s'.last_s. Since
s'.deadline ≠ ∞, Part 5 implies that s'.now ≤ s'.last_s + ε + k l_s + d. By Invariant 10.1
Part 4, we know that s'.time_r ≤ s'.now + ε. Therefore, s'.time_r ≤ s'.last_s + k l_s + d + 2ε.
This suffices.


a = receive_pkt_sr(m, t)
This increases lower_r if s.mode_r ≠ rec and s.lower_r < t ≤ s.upper_r.

If s.deadline = ∞ then also s'.deadline = ∞ and the result follows.

So assume s.deadline ≠ ∞. Then induction hypothesis Part 7 implies that s.mode_s =
send. Now, if t = s.last_s then s'.deadline = ∞ and the result follows. If t ≠ s.last_s,
then Invariant 10.6 Part 1 implies that t < s.last_s. Then, since s'.lower_r = t and
s'.last_s = s.last_s, we get s'.lower_r < s'.last_s, as needed.


7. Straightforward except for the case a = receive_pkt_rs(t, b).

a = receive_pkt_rs(t, b)

This may invalidate the invariant by changing mode_s to idle if we have t = s.last_s
and s.mode_s = send.

Invariant 10.6 Part 6 implies that s.last_s ≤ s.lower_r. From the induction hypothesis
Part 6 we then get s.deadline = ∞, and since s'.deadline = s.deadline the result
follows.
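
The way bound is evaluated in the case analysis of Parts 1 and 2 can be summarised in a few lines of code. The following is a sketch inferred from those cases, not a quotation of the definition of bound: the function and parameter names are hypothetical, and the channel state is abstracted to a flag saying whether the current packet is in transit plus its minimal time stamp.

```python
import math

def bound(mode_s, in_transit, mintime, last_c, count, k, l_s, d):
    """Latest real time for delivery, as read off from the cases in the proof.

    mode_s      sender mode ("send", "idle", ...)
    in_transit  True if the current packet (m, t) is in the channel sr
    mintime     smallest time stamp of an (m, t) packet in sr (used if in_transit)
    last_c      last(C) deadline of the sender's send class
    count       number of times (m, t) has been sent so far
    """
    if mode_s != "send":
        return math.inf
    if in_transit:
        return mintime + d
    return last_c + (k - 1 - count) * l_s + d

def deadline_after_choose_id(now, k, l_s, d):
    """Value given to deadline by choose_id when the receiver is not recovering."""
    return now + k * l_s + d
```

For example, with k = 3, l_s = 2 and d = 5, a choose_id step at real time 10 gives last(C) = 12 and count = 0, so bound = 12 + 2·2 + 5 = 21, which equals deadline = 10 + 3·2 + 5. A later send at time 11 that is lost gives 13 + 1·2 + 5 = 20, and one that enters the channel gives 11 + 5 = 16. Both are at most 21, matching Part 1.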


Proof of Invariant 10.12 
Straightforward. 
∎


Proof of Invariant 10.13 
Straightforward. 
∎


